Over the last ten years, cryptocurrency has become increasingly established as a mainstream financial instrument. Its behaviour, changes and trends can therefore be compared with those of traditional financial instruments. This project compares several cryptocurrencies with the prices of stocks listed on the Nasdaq in order to investigate the potential underlying causes of these changes.
I also have a personal interest in the factors that influence Bitcoin specifically. I bought €200 worth of Bitcoin in 2014 and promptly spent it on the dark web. Had it been sold at the peak, it would have been worth ~€120,000. It is therefore of interest to understand why my frivolous spending carried the opportunity cost of a down payment on a mortgage.
library(BatchGetSymbols)
library(yahoofinancer)
library(TTR)
library(quantmod)
library(psych)
library(PerformanceAnalytics)
library(pastecs)
library(heatmaply)
library(vars)
library(ggm)
library(tseries)
library(DescTools)
library(moments)
library(rugarch)
First, a list of traditional financial assets must be selected for analysis.
The companies selected for this analysis are as follows:
Tesla (TSLA) - Tesla is an automotive and energy company based in the USA. With a market cap of ~$550B, Tesla is one of the most prominent companies in this industry. It is also worth noting that Elon Musk, the CEO and a minority owner of Tesla, frequently uses social media in ways that have been described as stock price manipulation (Reuters, 2022).
Amazon (AMZN) - Amazon has been referred to as “one of the most influential economic and cultural forces worldwide” (Jacoby, 2020). With a market cap of ~$1.1T, Amazon is one of the most valuable companies in the world.
Apple (AAPL) - One of the largest technology companies in the world by revenue. Apple is a tech giant that, as of March 2023, has the largest market capitalisation in the world.
Meta (META) - Formerly Facebook, Meta is a tech conglomerate specialising in social media. Notably, it owns Facebook, Instagram and WhatsApp. With a market cap of ~$600B, it is one of the ten largest companies in the world by market valuation.
Google (GOOG) - Google is one of the pioneers of surveillance capitalism in the information age (Zuboff, 2019). After going public in 2004, Google has grown to be one of the largest companies in the world by revenue and market capitalisation.
Netflix (NFLX) - Netflix is one of the world’s leading online streaming platforms. With a leading market share in the US of 44.21% for the streaming industry, Netflix competes directly with the likes of Amazon and Google in this industry (Pandy, 2023). As such it is selected to see if there is a prominent difference in pricing behaviour compared to the other global conglomerates selected.
Microsoft (MSFT) - Another leading company in the tech industry. Microsoft has a current market capitalisation of ~$2.3T.
The historical financial data for each of these stocks were extracted from Yahoo Finance for the dates 2015-01-01 to 2021-07-07, to give a similar scope to the available cryptocurrency data.
The close prices of the stocks were extracted and combined into a single time series, and the returns for each stock were then calculated using the Return.calculate function.
tickers <- c('TSLA', 'AMZN', 'AAPL', 'META', 'GOOG', 'NFLX', 'MSFT')
getSymbols(tickers,
src='yahoo',
from = "2015-01-01",
to = "2021-07-07",
periodicity = 'daily')
[1] "TSLA" "AMZN" "AAPL" "META" "GOOG" "NFLX" "MSFT"
## extracting close prices and binding to one time series
stock.prices <- cbind(TSLA$TSLA.Close,
AMZN$AMZN.Close,
AAPL$AAPL.Close,
META$META.Close,
GOOG$GOOG.Close,
NFLX$NFLX.Close,
MSFT$MSFT.Close)
## calculating returns of stock prices
stock.returns <- Return.calculate(stock.prices)
colnames(stock.returns) <- tickers
colnames(stock.prices) <- tickers
head(stock.returns)
TSLA AMZN AAPL META GOOG NFLX MSFT
2015-01-02 NA NA NA NA NA NA NA
2015-01-05 -0.042041028 -0.020517290 -2.817161e-02 -0.016061116 -0.020845616 -0.050897019 -0.009195819
2015-01-06 0.005664237 -0.022833335 9.413775e-05 -0.013473259 -0.023177093 -0.017120548 -0.014677321
2015-01-07 -0.001561931 0.010599740 1.402219e-02 0.000000000 -0.001713234 0.005191848 0.012705323
2015-01-08 -0.001564306 0.006836019 3.842227e-02 0.026657895 0.003153037 0.022188200 0.029418140
2015-01-09 -0.018801629 -0.011748610 1.072506e-03 -0.005628069 -0.012950547 -0.015457748 -0.008405159
tail(stock.returns)
TSLA AMZN AAPL META GOOG NFLX MSFT
2021-06-28 0.025079266 0.012474089 0.012546001 0.0418022080 -0.001381936 0.0113078375 1.396126e-02
2021-06-29 -0.011557682 0.001234034 0.011500245 -0.0105443703 -0.006316083 0.0008816965 9.973179e-03
2021-06-30 -0.001557080 -0.002314303 0.004621176 -0.0118787775 -0.005574573 -0.0099156102 -1.842299e-03
2021-07-01 -0.002618823 -0.002090002 0.002263417 0.0192114790 0.008398751 0.0100906000 2.584024e-03
2021-07-02 0.001445637 0.022723749 0.019596433 0.0008747356 0.018600319 0.0008246850 2.227536e-02
2021-07-06 -0.028457810 0.046927107 0.014718473 -0.0054130628 0.008172857 0.0143451711 3.605174e-05
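As a rough illustration of what the return calculation does, the sketch below computes simple (discrete) returns by hand on a made-up price vector. It assumes that Return.calculate is being used with its default discrete method, so that each return is the price divided by the previous price, minus one. The toy.prices values and the simple_returns helper are hypothetical and are not part of the analysis.

```r
# Toy closing prices (hypothetical values, not taken from the data set)
toy.prices <- c(100, 102, 99.96, 104.958)

# Simple (discrete) return: r_t = P_t / P_{t-1} - 1
# The first observation has no predecessor, so its return is NA
simple_returns <- function(p) {
  c(NA, p[-1] / p[-length(p)] - 1)
}

toy.returns <- simple_returns(toy.prices)
print(toy.returns)
```

The leading NA corresponds to the first row of NA values in head(stock.returns) above.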
## descriptive statistics for stock prices
stat.desc(stock.returns)
The cryptocurrencies selected for analysis are as follows:
Bitcoin (BTC) - Bitcoin is the largest and most famous cryptocurrency in the world. Invented in 2008 by a person or group known as “Satoshi Nakamoto”, Bitcoin has gone from being a fringe form of currency to one of the largest
Dogecoin (DOGE) - Dogecoin is an “altcoin” invented in 2013. One of the first “memecoins”, Dogecoin is considered a satirical coin by its creators. However, it has gained large market traction and its peak market capitalisation was over $85B in May 2021. Dogecoin has been used in many satirical online movements, such as sponsorship of the Jamaican Bobsled team and recently being the sleeve sponsor of Watford Football Club (Hern, 2014).
Ethereum (ETH) - Ethereum is the second largest cryptocurrency by market capitalisation in the world. A “smart” cryptocurrency, it is selected for analysis to investigate the similarities or differences it has to Bitcoin.
Litecoin (LTC) - The “Lite” version of Bitcoin. Litecoin is named as such because its mining is far less demanding than the mining algorithm of Bitcoin. The mining algorithm for Litecoin, known as “scrypt”, facilitates the use of less powerful hardware, making it more accessible to a greater portion of its users.
Monero (MONERO) - Monero has a strong focus on privacy and anonymity. The main focus of Monero is an inability for the transaction to be traced or identified. Monero is selected for analysis as it has been accepted hesitantly by the cryptocurrency community and as such may have different pricing behaviours (Sephton, 2022).
The dataset available from Kaggle has complete data only from 2015-08-08 to 2021-06-07.
## loading in crypto prices from csv
crypto.prices <- read.csv('crypto.prices.csv')
## specifying dates
crypto.prices$Date <- as.POSIXct(crypto.prices$Date, format = "%d/%m/%Y %H:%M")
## removing the hours and minutes
crypto.prices$Date <- as.Date(format(crypto.prices$Date,
format = '%Y-%m-%d'))
## specifying times series
crypto.prices.xts <- xts(crypto.prices[,-1], order.by = crypto.prices$Date)
## calculating returns of crypto prices
crypto.returns <- Return.calculate(crypto.prices.xts)
## descriptive statistics for crypto prices
stat.desc(crypto.returns)
Next, the two time series of stock and cryptocurrency prices are merged, and any NAs within the merged series are omitted. This gives a complete series of returns for trading days (Monday to Friday) over the period 2015-08-11 to 2021-07-02.
The descriptive statistics for this time series show that Dogecoin had the highest mean daily return over the period (~0.0104), while Google (~0.00108) and Meta (~0.00110) had the lowest. This is reflected in the plot of the Dogecoin price, which shows an extreme growth period in the first half of 2021, whereas Meta maintained a slower upward trend over the period measured. It is also reflected in the volatility of each asset: Dogecoin has the highest variance (~0.018), roughly three times that of the next most volatile asset, Monero (~0.0064), and more than an order of magnitude above that of any stock. Google had the lowest variance, comparable to that of Microsoft and, to a lesser extent, Amazon, Apple and Meta. The cryptocurrencies all had higher variance than the stocks: Bitcoin had the lowest variance of the cryptocurrencies (~0.0022), which is still above that of Tesla, the highest of the stocks (~0.0013).
The difference in volatility between the cryptocurrency returns and the stock returns indicates the level of risk associated with each. A risk-loving, short-term investor would be better suited to the cryptocurrencies, whereas a risk-averse investor would be better suited to the stocks.
However, the mean return over this period is also higher for the cryptocurrencies, which suggests that investors were compensated for the higher risk with a higher return.
This finding may steer future investment strategy towards cryptocurrencies rather than these stocks where a higher-risk, higher-return strategy is wanted. Notably, the stocks may be used as a portfolio component to mitigate some of the risk of investing exclusively in cryptocurrencies.
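One rough way to check whether the extra return compensates for the extra risk is to divide each mean daily return by its standard deviation. The sketch below does this for four of the assets, using the means and standard deviations printed in the location and variability tables of this report, rounded. The ratio is an illustrative measure chosen here, not one used in the original analysis, and it ignores any risk-free rate.

```r
# Mean daily returns and standard deviations, rounded from the report's tables
mean.ret <- c(GOOG = 0.001085, TSLA = 0.002429, BTC = 0.004394, DOGE = 0.010358)
sd.ret   <- c(GOOG = 0.016773, TSLA = 0.036057, BTC = 0.047183, DOGE = 0.134864)

# Mean return earned per unit of daily volatility (illustrative measure only)
return.per.risk <- mean.ret / sd.ret
print(round(return.per.risk, 3))
```

On this measure Bitcoin and Dogecoin come out somewhat ahead of Google and Tesla, although the gap is far smaller than the gap in raw mean returns.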
## merging stock data and crypto data
prices <- merge(stock.prices, crypto.prices.xts)
total.df <- na.omit(prices)
returns <- Return.calculate(total.df)
## final time series
returns <- na.omit(returns)
head(returns)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC
2015-08-11 -0.015634065 0.0066029933 -0.0520381164 -0.005629302 0.042683821 -0.002357156 -0.0194380297 0.022369210
2015-08-12 0.003370263 -0.0029385884 0.0154198611 0.006088439 -0.001846328 -0.018168452 0.0071105760 -0.014830573
2015-08-13 0.018222279 0.0071304977 -0.0007809471 -0.008068820 -0.004715335 0.026719784 -0.0002139952 -0.008619472
2015-08-14 0.002638993 0.0035117023 0.0070342818 0.010596145 0.001020662 -0.002747951 0.0057778827 0.006058793
2015-08-17 0.048694225 0.0069611478 0.0103484356 -0.005189556 0.005706720 0.015965647 0.0068085041 -0.028997214
2015-08-18 0.022471469 -0.0003736867 -0.0056333530 0.013201297 -0.007172349 -0.010449885 -0.0010566196 -0.181788290
DOGE ETH LTC MONERO
2015-08-11 0.022919753 0.507323075 0.05334936 0.032043685
2015-08-12 -0.020774777 0.140074543 -0.04185553 -0.075559899
2015-08-13 -0.024659928 0.501240278 -0.02637342 0.059131205
2015-08-14 0.008950278 0.000109447 0.03759716 0.005459844
2015-08-17 -0.048044792 -0.341523229 -0.01049864 -0.195604989
2015-08-18 -0.108059682 -0.096841990 -0.12908006 0.082745483
summary(returns)
Index TSLA AMZN AAPL META GOOG
Min. :2015-08-11 Min. :-0.210628 Min. :-0.079221 Min. :-0.1286470 Min. :-0.189609 Min. :-0.111008
1st Qu.:2017-01-31 1st Qu.:-0.014823 1st Qu.:-0.007246 1st Qu.:-0.0068831 1st Qu.:-0.007754 1st Qu.:-0.005844
Median :2018-07-23 Median : 0.001259 Median : 0.001484 Median : 0.0009011 Median : 0.001191 Median : 0.001176
Mean :2018-07-22 Mean : 0.002429 Mean : 0.001460 Mean : 0.0012179 Mean : 0.001105 Mean : 0.001085
3rd Qu.:2020-01-13 3rd Qu.: 0.019023 3rd Qu.: 0.010787 3rd Qu.: 0.0102688 3rd Qu.: 0.011293 3rd Qu.: 0.008987
Max. :2021-07-02 Max. : 0.198949 Max. : 0.132164 Max. : 0.1198083 Max. : 0.155214 Max. : 0.104485
NFLX MSFT BTC DOGE ETH LTC
Min. :-0.131262 Min. :-0.147390 Min. :-0.371695 Min. :-0.477473 Min. :-0.4234722 Min. :-0.3617733
1st Qu.:-0.011413 1st Qu.:-0.005932 1st Qu.:-0.013785 1st Qu.:-0.024094 1st Qu.:-0.0263984 1st Qu.:-0.0228629
Median : 0.000476 Median : 0.001101 Median : 0.002932 Median :-0.001135 Median : 0.0007199 Median : 0.0001892
Mean : 0.001325 Mean : 0.001342 Mean : 0.004394 Mean : 0.010358 Mean : 0.0083869 Mean : 0.0046698
3rd Qu.: 0.014689 3rd Qu.: 0.009439 3rd Qu.: 0.022969 3rd Qu.: 0.021219 3rd Qu.: 0.0360797 3rd Qu.: 0.0254131
Max. : 0.190281 Max. : 0.142169 Max. : 0.252472 Max. : 3.555712 Max. : 0.6647704 Max. : 0.7157378
MONERO
Min. :-0.413860
1st Qu.:-0.027160
Median : 0.001346
Mean : 0.006930
3rd Qu.: 0.035759
Max. : 0.933872
stat.desc(returns)
plot(crypto.prices$DOGE, type="l")
plot(stock.prices$META)
plot(stock.prices$GOOG)
plot(crypto.prices$BTC, type="l")
plot(stock.prices$TSLA)
plot(stock.prices)
Plots of the returns can be inspected to see if there is presence of volatility clustering.
There does seem to be some evidence of volatility clustering but nothing conclusive can be drawn from this plot.
plot(returns)
Looking in more depth at some location metrics for the returns, there is relatively little difference between the mean and the 10% trimmed mean for the traditional stocks, whereas there is a marked difference for all cryptocurrencies except Bitcoin. This indicates that the mean returns of the cryptocurrencies are driven to a much greater extent by extreme values, whereas the returns of the traditional assets are more stable.
location.mean <- sapply(returns, mean)
location.mean.t <- sapply(returns, mean, trim=0.1)
location.median <- sapply(returns, median)
location <- rbind(location.mean, location.mean.t, location.median)
print(location)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE
location.mean 0.002429127 0.001459665 0.0012179065 0.001104630 0.001084958 0.0013251163 0.001342072 0.004393800 1.035764e-02
location.mean.t 0.001689727 0.001524434 0.0013737315 0.001337789 0.001406672 0.0013097276 0.001451763 0.004114200 -5.126234e-05
location.median 0.001258703 0.001484485 0.0009011348 0.001191094 0.001176446 0.0004759975 0.001100711 0.002931473 -1.134800e-03
ETH LTC MONERO
location.mean 0.0083868818 0.0046697729 0.006929797
location.mean.t 0.0041465206 0.0014823799 0.003225751
location.median 0.0007198898 0.0001892313 0.001345997
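To see why a gap between the mean and the trimmed mean points to extreme values, the sketch below applies the same three location measures to a small made-up sample with a single spike. The toy.returns values are hypothetical and chosen only to make the effect visible.

```r
# Toy daily returns: nine small values and one extreme spike (hypothetical)
toy.returns <- c(-0.02, -0.01, -0.01, 0.00, 0.00, 0.01, 0.01, 0.02, 0.02, 3.50)

plain.mean   <- mean(toy.returns)              # pulled upwards by the spike
trimmed.mean <- mean(toy.returns, trim = 0.1)  # drops lowest and highest 10%
middle.value <- median(toy.returns)

print(c(mean = plain.mean, trimmed = trimmed.mean, median = middle.value))
```

The single spike moves the plain mean far away from the trimmed mean and the median, which is the same pattern seen for Dogecoin in the table above.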
Next, some metrics of variability are examined to distinguish further between the volatility of the returns of the selected assets.
The stocks had reasonably consistent standard deviations of returns, indicating consistent volatility. Tesla did, however, have the highest standard deviation of all the traditional stocks, which was seen in the plot of the stock prices showing Tesla’s price as slightly more erratic.
The cryptocurrencies by comparison have much higher standard deviations, indicating higher volatility of returns. Dogecoin in particular had a standard deviation several times higher than that of any of the traditional stocks (~0.135 against ~0.017 to ~0.036). This gives further credence to the notion that these cryptocurrencies are a higher-risk investment than the traditional stocks.
var.MeanAD <- sapply(returns, MeanAD)
var.variance <- sapply(returns, var)
var.sd <- sapply(returns, sd)
var.MedAD <- sapply(returns, mad)
variability <- rbind(var.MeanAD, var.variance, var.sd, var.MedAD)
print(variability)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE
var.MeanAD 0.024496827 0.0130560955 0.0127963328 0.0136472764 0.0113103943 0.0180904640 0.011468227 0.030729717 0.05260454
var.variance 0.001300085 0.0003573921 0.0003570285 0.0004198063 0.0002813272 0.0006752323 0.000300276 0.002226247 0.01818834
var.sd 0.036056689 0.0189048167 0.0188951990 0.0204891757 0.0167728102 0.0259852320 0.017328473 0.047183121 0.13486415
var.MedAD 0.025077563 0.0133252262 0.0126415199 0.0142668564 0.0110658092 0.0190235250 0.011312663 0.026883653 0.03368743
ETH LTC MONERO
var.MeanAD 0.050856080 0.042205183 0.050268417
var.variance 0.006273657 0.004837048 0.006405274
var.sd 0.079206421 0.069548890 0.080032956
var.MedAD 0.045342243 0.035928548 0.046607493
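The four variability measures react very differently to outliers, which the following sketch tries to make concrete on made-up data. It assumes that MeanAD takes absolute deviations about the mean (its default in DescTools, as far as I am aware) and relies on the fact that mad() in base R multiplies the median absolute deviation by 1.4826. The vector x is hypothetical.

```r
# Toy returns with one outlier (hypothetical values)
x <- c(-0.03, -0.01, 0.00, 0.01, 0.02, 0.40)

# Mean absolute deviation about the mean
# (assumed to correspond to the default behaviour of DescTools::MeanAD)
mean.abs.dev <- mean(abs(x - mean(x)))

# Median absolute deviation; stats::mad() rescales by 1.4826 by default so
# that it estimates the standard deviation for normally distributed data
med.abs.dev <- 1.4826 * median(abs(x - median(x)))

print(c(MeanAD = mean.abs.dev, var = var(x), sd = sd(x), MedAD = med.abs.dev))
```

Because the median-based measure largely ignores the outlier, it stays small while the standard deviation is inflated. The same effect is visible in the table above, where Dogecoin's MedAD is in line with the other cryptocurrencies while its standard deviation is far larger.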
The Box plot shows the presence of huge outliers for the cryptocurrencies compared to the stocks. Dogecoin in particular has an extreme outlier.
boxplot(returns, horizontal=TRUE, main="Returns")
Looking just at the stock returns, the outliers are on a much smaller scale. Tesla can again be seen to have greater variability through its larger inter-quartile range and wider-ranging outliers.
boxplot(returns[,1:7], horizontal=TRUE, main="Stock Returns")
skewness <- skewness(returns)
print(skewness)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE
0.302171129 0.270000116 -0.084663278 -0.306955539 0.003985138 0.292989215 0.088190876 -0.023966773 13.960298074
ETH LTC MONERO
1.423177167 1.950153918 2.229433503
The results of the D’Agostino skewness tests show statistically significant skewness for all assets except Apple, Microsoft, Google and Bitcoin. Significant skewness for the other cryptocurrencies is expected given the previous results and statistics, as it is for Tesla. However, the statistically significant skewness for Amazon, Meta and Netflix does indicate a slightly more irregular pattern of returns than expected.
for (i in 1:ncol(returns)){
print(colnames(returns)[i])
print(agostino.test(returns[,i], alternative = "two.sided"))
}
[1] "TSLA"
D'Agostino skewness test
data: returns[, i]
skew = 0.30217, z = 4.67614, p-value = 2.923e-06
alternative hypothesis: data have a skewness
[1] "AMZN"
D'Agostino skewness test
data: returns[, i]
skew = 0.2700, z = 4.1957, p-value = 2.72e-05
alternative hypothesis: data have a skewness
[1] "AAPL"
D'Agostino skewness test
data: returns[, i]
skew = -0.084663, z = -1.336229, p-value = 0.1815
alternative hypothesis: data have a skewness
[1] "META"
D'Agostino skewness test
data: returns[, i]
skew = -0.30696, z = -4.74708, p-value = 2.064e-06
alternative hypothesis: data have a skewness
[1] "GOOG"
D'Agostino skewness test
data: returns[, i]
skew = 0.0039851, z = 0.0630073, p-value = 0.9498
alternative hypothesis: data have a skewness
[1] "NFLX"
D'Agostino skewness test
data: returns[, i]
skew = 0.29299, z = 4.53962, p-value = 5.636e-06
alternative hypothesis: data have a skewness
[1] "MSFT"
D'Agostino skewness test
data: returns[, i]
skew = 0.088191, z = 1.391697, p-value = 0.164
alternative hypothesis: data have a skewness
[1] "BTC"
D'Agostino skewness test
data: returns[, i]
skew = -0.023967, z = -0.378876, p-value = 0.7048
alternative hypothesis: data have a skewness
[1] "DOGE"
D'Agostino skewness test
data: returns[, i]
skew = 13.960, z = 45.855, p-value < 2.2e-16
alternative hypothesis: data have a skewness
[1] "ETH"
D'Agostino skewness test
data: returns[, i]
skew = 1.4232, z = 17.1159, p-value < 2.2e-16
alternative hypothesis: data have a skewness
[1] "LTC"
D'Agostino skewness test
data: returns[, i]
skew = 1.9502, z = 20.7856, p-value < 2.2e-16
alternative hypothesis: data have a skewness
[1] "MONERO"
D'Agostino skewness test
data: returns[, i]
skew = 2.2294, z = 22.4047, p-value < 2.2e-16
alternative hypothesis: data have a skewness
Kurtosis values for all assets are much greater than 3, the kurtosis of a normal distribution. These results indicate a higher peak and fatter tails than a normal distribution, and therefore more extreme values.
kurtosis <- kurtosis(returns)
print(kurtosis)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC
8.540201 7.583795 9.317575 14.205735 9.083281 8.609348 13.169041 9.203695 335.108786 12.607643 21.022295
MONERO
23.811982
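For reference, the skewness and kurtosis figures above can be reproduced in principle from the central moments of each series. The sketch below assumes the moment-based definitions that the moments package is understood to use (third and fourth central moments scaled by the second, with no excess-kurtosis adjustment). The helper names and the two samples are hypothetical.

```r
# Moment-based sample skewness and kurtosis (non-excess, so normal = 3)
sample_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
sample_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

set.seed(1)
normal.sample <- rnorm(1e5)          # thin tails
spiky.sample  <- c(rep(0, 99), 10)   # mostly calm with one large jump

print(c(skew = sample_skewness(normal.sample), kurt = sample_kurtosis(normal.sample)))
print(c(skew = sample_skewness(spiky.sample),  kurt = sample_kurtosis(spiky.sample)))
```

A single large positive jump is enough to push both statistics far above the normal benchmark, which is the kind of behaviour the Dogecoin figures suggest.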
Further testing for normality is done by drawing normal Q-Q plots with qqnorm.
None of the plots shows the straight diagonal line that would be seen if the returns were normally distributed. This is a strong indicator that the returns of all the selected assets are non-normally distributed. This may be a cause for concern for investment strategy, as normally distributed returns are easier to predict and account for.
par(mfrow=c(2,6))
sapply(returns, qqnorm)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE
x numeric,1485 numeric,1485 numeric,1485 numeric,1485 numeric,1485 numeric,1485 numeric,1485 numeric,1485 numeric,1485
y xts,1485 xts,1485 xts,1485 xts,1485 xts,1485 xts,1485 xts,1485 xts,1485 xts,1485
ETH LTC MONERO
x numeric,1485 numeric,1485 numeric,1485
y xts,1485 xts,1485 xts,1485
Next, the Shapiro-Wilk test for normality is executed to further investigate the results of the Q-Q plots and the normality of the return distributions.
The test gives a statistically significant result for every series, so the hypothesis of normally distributed returns is rejected for all assets. This further supports the results of the Q-Q plots and the kurtosis values.
for (i in 1:ncol(returns))
{
print(colnames(returns)[i])
print(shapiro.test(as.vector(returns[,i])))
}
[1] "TSLA"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.92261, p-value < 2.2e-16
[1] "AMZN"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.93886, p-value < 2.2e-16
[1] "AAPL"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.92321, p-value < 2.2e-16
[1] "META"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.89838, p-value < 2.2e-16
[1] "GOOG"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.91942, p-value < 2.2e-16
[1] "NFLX"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.93927, p-value < 2.2e-16
[1] "MSFT"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.89955, p-value < 2.2e-16
[1] "BTC"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.91313, p-value < 2.2e-16
[1] "DOGE"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.41343, p-value < 2.2e-16
[1] "ETH"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.8726, p-value < 2.2e-16
[1] "LTC"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.82915, p-value < 2.2e-16
[1] "MONERO"
Shapiro-Wilk normality test
data: as.vector(returns[, i])
W = 0.83554, p-value < 2.2e-16
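To give a feel for how the Shapiro-Wilk statistic behaves, the sketch below runs the test on two simulated samples of the same length as the returns series (1485 observations): one normal and one fat-tailed. The samples, the scale of 0.02 and the choice of a Student t distribution with 3 degrees of freedom are assumptions made purely for illustration.

```r
set.seed(42)
n <- 1485                               # same length as the returns series

normal.sample <- rnorm(n, sd = 0.02)
heavy.sample  <- 0.02 * rt(n, df = 3)   # Student t with 3 df: fat tails

# Shapiro-Wilk: H0 is that the sample comes from a normal distribution
normal.test <- shapiro.test(normal.sample)
heavy.test  <- shapiro.test(heavy.sample)

print(c(normal = normal.test$p.value, heavy = heavy.test$p.value))
```

The fat-tailed sample produces a lower W statistic and a very small p-value, in the same way as the asset returns above.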
Next, the density of all returns is examined. The kernel density plot shows the estimated probability density function for the sample of returns within this period. The curve is not bell-shaped, indicating that a normal distribution may not be a suitable model. The scale of the x axis, which extends to a much higher value than the bulk of the data, also shows the presence of an outlier.
Next, autocorrelation is calculated for each variable in the time series.
The autocorrelation plots show a fast decay to zero, indicating that the return series for these stocks and cryptocurrencies are stationary. Several of the autocorrelations also fall outside the test bounds, indicating that the change in price is not white noise.
The Box-Ljung test was then applied to each variable, to test whether the null hypothesis that the series is not autocorrelated can be rejected.
The stocks and cryptocurrencies with significant p-values at the 5% level, for which the null hypothesis of no autocorrelation is rejected, are Apple, Meta, Google, Netflix, Microsoft, Litecoin and Monero.
This indicates that the returns for these series are autocorrelated, and they can therefore be investigated further for the cause of the non-randomness that influences the change in price. Although the other stocks and cryptocurrencies did not test significantly enough to reject the null hypothesis, it cannot be stated with certainty that their price changes are independent of one another, and as such they may still be investigated for the deterministic factors that may cause the change.
return.density <- density(returns)
plot(return.density)
## acf on returns
par(mfrow=c(2,6))
lapply(c(returns), acf)
$TSLA
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 0.000 0.039 0.044 -0.029 -0.043 -0.017 0.028 0.015 0.015 0.046 -0.020 0.023 0.051 -0.016 0.015 -0.021 -0.038
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.023 -0.054 -0.010 0.014 -0.037 0.000 0.024 0.017 -0.030 -0.018 -0.005 0.008 -0.023 0.006
$AMZN
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.033 0.008 -0.056 -0.013 0.023 -0.033 0.017 -0.079 0.067 -0.047 -0.019 0.007 -0.026 -0.020 -0.014 0.039 0.017
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.066 0.003 -0.004 -0.021 -0.028 0.032 -0.022 -0.033 -0.013 0.028 -0.002 -0.046 -0.038 0.005
$AAPL
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.107 0.022 -0.023 -0.011 0.030 -0.057 0.122 -0.094 0.119 -0.030 -0.002 0.030 -0.061 0.056 -0.065 0.061 -0.011
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.065 0.031 -0.041 0.034 -0.088 0.021 -0.042 0.029 -0.034 0.011 0.037 -0.019 -0.005 -0.051
$META
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.075 0.017 -0.063 0.012 -0.019 -0.057 0.034 -0.084 0.062 -0.064 0.029 0.016 -0.029 0.016 -0.035 0.074 -0.010
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.071 -0.028 0.029 0.006 -0.016 -0.006 -0.032 -0.003 -0.040 0.031 -0.020 -0.043 0.018 -0.002
$GOOG
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.104 0.012 -0.017 -0.024 -0.006 -0.086 0.132 -0.116 0.079 -0.023 -0.037 0.031 -0.055 0.036 -0.060 0.103 -0.014
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.044 0.007 -0.055 0.023 -0.090 0.017 -0.029 0.052 -0.033 0.007 0.021 -0.071 -0.013 -0.018
$NFLX
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.049 0.032 -0.007 0.021 -0.091 -0.019 0.002 -0.049 0.012 -0.041 0.005 0.032 -0.001 0.022 -0.023 0.047 -0.034
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.037 -0.026 -0.025 0.012 -0.067 0.014 -0.015 0.021 -0.022 -0.015 -0.011 0.006 0.008 0.018
$MSFT
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.231 0.033 0.019 -0.047 0.000 -0.082 0.128 -0.149 0.121 -0.063 0.004 0.020 -0.065 0.053 -0.082 0.082 -0.025
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.022 -0.022 -0.033 0.050 -0.088 0.033 -0.054 0.010 -0.047 0.017 -0.004 -0.016 0.042 -0.053
$BTC
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.017 0.013 0.048 0.026 -0.003 -0.005 0.061 -0.015 0.001 0.032 0.026 0.046 0.004 -0.002 -0.032 0.000 -0.034
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.005 0.009 0.002 -0.025 0.024 0.003 0.017 -0.002 -0.007 -0.002 0.080 0.008 0.040 0.012
$DOGE
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 0.069 0.006 -0.013 0.020 0.035 -0.049 0.104 0.004 0.040 0.058 0.011 -0.016 -0.001 0.002 -0.009 -0.023 0.142
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.019 0.001 0.000 0.008 0.047 -0.042 0.001 0.009 0.043 0.005 0.003 -0.009 -0.015 0.016
$ETH
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 0.042 0.026 0.035 0.011 0.010 0.003 0.042 0.006 0.030 0.044 0.054 0.055 0.063 0.035 -0.007 0.000 0.023
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.028 -0.023 -0.013 0.009 -0.009 -0.048 -0.024 0.000 0.001 0.018 0.045 0.006 0.041 -0.019
$LTC
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 0.061 0.019 0.002 0.059 0.003 -0.025 0.040 -0.061 0.007 0.063 0.081 0.022 -0.021 -0.006 -0.044 0.035 0.019
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.019 0.009 0.002 0.027 0.030 0.019 0.006 -0.023 -0.019 0.000 0.033 0.011 0.023 -0.043
$MONERO
Autocorrelations of series ‘X[[i]]’, by lag
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
1.000 -0.038 -0.015 0.050 0.088 0.044 -0.009 -0.036 0.038 0.014 0.026 0.044 0.014 0.037 -0.008 -0.028 -0.063 0.005
18 19 20 21 22 23 24 25 26 27 28 29 30 31
0.007 0.051 -0.057 -0.009 0.027 -0.009 -0.016 -0.026 -0.044 -0.025 -0.031 0.015 -0.010 -0.030
## Box-Ljung tests
lapply(c(returns), Box.test, lag = 5, type = "Ljung-Box")
$TSLA
Box-Ljung test
data: X[[i]]
X-squared = 9.2771, df = 5, p-value = 0.09851
$AMZN
Box-Ljung test
data: X[[i]]
X-squared = 7.5394, df = 5, p-value = 0.1835
$AAPL
Box-Ljung test
data: X[[i]]
X-squared = 20.15, df = 5, p-value = 0.001171
$META
Box-Ljung test
data: X[[i]]
X-squared = 15.406, df = 5, p-value = 0.008761
$GOOG
Box-Ljung test
data: X[[i]]
X-squared = 17.653, df = 5, p-value = 0.003414
$NFLX
Box-Ljung test
data: X[[i]]
X-squared = 18.24, df = 5, p-value = 0.00266
$MSFT
Box-Ljung test
data: X[[i]]
X-squared = 84.678, df = 5, p-value < 2.2e-16
$BTC
Box-Ljung test
data: X[[i]]
X-squared = 5.2502, df = 5, p-value = 0.3861
$DOGE
Box-Ljung test
data: X[[i]]
X-squared = 9.7294, df = 5, p-value = 0.08328
$ETH
Box-Ljung test
data: X[[i]]
X-squared = 5.7273, df = 5, p-value = 0.3337
$LTC
Box-Ljung test
data: X[[i]]
X-squared = 11.331, df = 5, p-value = 0.0452
$MONERO
Box-Ljung test
data: X[[i]]
X-squared = 20.813, df = 5, p-value = 0.0008788
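The X-squared values above can be understood by computing the Ljung-Box statistic directly from the sample autocorrelations. The sketch below does this on a simulated AR(1) series and compares the result with Box.test. The simulated series and its coefficient of -0.2 are assumptions for illustration only and are not fitted to any of the assets.

```r
set.seed(7)
# Simulated AR(1) series with mild negative autocorrelation (illustrative only)
x <- as.numeric(arima.sim(model = list(ar = -0.2), n = 1000))

n <- length(x)
h <- 5                                             # same lag as in the report
rho <- acf(x, lag.max = h, plot = FALSE)$acf[-1]   # drop lag 0

# Ljung-Box statistic: Q = n (n + 2) * sum_k rho_k^2 / (n - k)
Q <- n * (n + 2) * sum(rho^2 / (n - seq_len(h)))
p.value <- pchisq(Q, df = h, lower.tail = FALSE)

built.in <- Box.test(x, lag = h, type = "Ljung-Box")
print(c(Q = Q, p = p.value))
print(built.in)
```

A small p-value rejects the null hypothesis of no autocorrelation up to lag 5, which is how the significant results above should be read.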
To calculate the correlation of returns, the Spearman method was used. This method does not assume a linear relationship between the returns of the variables; it assumes only that the variables change together monotonically, not necessarily at a constant rate.
Looking at the heatmap of correlation coefficients, there is a clear divide between the returns. The stock returns are all correlated with each other to a significantly higher degree than with the cryptocurrencies. Similarly, the cryptocurrencies have a much higher correlation with each other than with the stock returns. The notable exception is Monero, which has no meaningful correlation with any other variable.
These results are to be expected, as cryptocurrencies have been found to be positively correlated with one another (Blockworks, 2023). The exception of Monero may be explained in part by how it differs from the other selected cryptocurrencies, in prioritising the privacy and anonymity of the transaction and the user. This has caused some divide over its acceptance and a decrease in Bitcoin’s influence on its pricing (Sephton, 2022).
The Pearson correlation was also calculated but did not show any further findings or insights.
# Spearman Correlation
corr.matrix <- cor(returns, method = "spearman")
corr.df <- as.data.frame(corr.matrix)
heatmaply(corr.df)
## Pearson Correlation
corr.matrix.p <- cor(returns, method = "pearson")
corr.df.p <- as.data.frame(corr.matrix.p)
heatmaply(corr.df.p)
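The difference between the two correlation methods can be illustrated on simulated data in which one variable is a strictly increasing but strongly non-linear function of the other. The variables x and y below are made up for this purpose and have no connection to the asset returns.

```r
set.seed(3)
x <- rnorm(200)
y <- exp(3 * x)          # increases with x, but far from linearly

pearson  <- cor(x, y, method = "pearson")
spearman <- cor(x, y, method = "spearman")

# Spearman's rho is Pearson's correlation computed on the ranks
spearman.by.rank <- cor(rank(x), rank(y))

print(c(pearson = pearson, spearman = spearman, by.rank = spearman.by.rank))
```

Spearman's coefficient captures the perfect monotonic relationship, while Pearson's coefficient is pulled down by the non-linearity.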
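To further investigate the presence of volatility clustering, an ARMA-GARCH model, with an AR(1) mean equation and a GARCH(1,1) variance equation, is fitted.
In the fitted model, alpha1 (0.050036) and beta1 (0.948964) are both statistically significant and sum to roughly 0.999. This indicates highly persistent conditional variance, which is consistent with volatility clustering. The size of the log-likelihood and the indexing error at the end of the output suggest that the model was fitted to all twelve return columns stacked together rather than to a single asset, so the estimates should be read as describing the pooled returns.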
arma.garch.norm = ugarchspec(mean.model=list(armaOrder=c(1,0)),
variance.model=list(garchOrder=c(1,1)))
returns.garch.norm = ugarchfit(data=returns, spec=arma.garch.norm)
returns.garch.norm
*---------------------------------*
* GARCH Model Fit *
*---------------------------------*
Conditional Variance Dynamics
-----------------------------------
GARCH Model : sGARCH(1,1)
Mean Model : ARFIMA(1,0,0)
Distribution : norm
Optimal Parameters
------------------------------------
Estimate Std. Error t value Pr(>|t|)
mu 0.001492 0.000167 8.911041 0.0000
ar1 -0.000482 0.008979 -0.053668 0.9572
omega 0.000004 0.000001 5.323143 0.0000
alpha1 0.050036 0.001829 27.350860 0.0000
beta1 0.948964 0.001991 476.731809 0.0000
Robust Standard Errors:
Estimate Std. Error t value Pr(>|t|)
mu 0.001492 0.000176 8.466766 0.000000
ar1 -0.000482 0.023942 -0.020127 0.983942
omega 0.000004 0.000006 0.656578 0.511453
alpha1 0.050036 0.011370 4.400881 0.000011
beta1 0.948964 0.014528 65.318588 0.000000
LogLikelihood : 35065.96
Error in xts(object@fit$fitted.values, D) :
'order.by' cannot contain 'NA', 'NaN', or 'Inf'
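A simpler, model-free check for volatility clustering is to test the squared returns for autocorrelation. The sketch below simulates a GARCH(1,1) process in base R and applies the Ljung-Box test to its squared values. The parameters omega, alpha and beta are illustrative assumptions and are not the estimates reported above, and this is a simulation rather than a refit of the model.

```r
set.seed(123)
n <- 2000
omega <- 1e-5; alpha <- 0.10; beta <- 0.85   # illustrative values only

# Simulate r_t = sigma_t * z_t with
# sigma_t^2 = omega + alpha * r_{t-1}^2 + beta * sigma_{t-1}^2
z <- rnorm(n)
r <- numeric(n)
sigma2 <- numeric(n)
sigma2[1] <- omega / (1 - alpha - beta)      # unconditional variance
r[1] <- sqrt(sigma2[1]) * z[1]
for (i in 2:n) {
  sigma2[i] <- omega + alpha * r[i - 1]^2 + beta * sigma2[i - 1]
  r[i] <- sqrt(sigma2[i]) * z[i]
}

# Volatility clustering shows up as autocorrelation in the squared returns
clustered <- Box.test(r^2, lag = 10, type = "Ljung-Box")
print(clustered)
```

The same check could be applied to a single asset, for example with Box.test(as.numeric(returns$BTC)^2, lag = 10, type = "Ljung-Box").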
Causality
Causality tests are executed to further understand the relationships each variable has onto another. Knowing whether a time series has useful information for forecasting another is critical to understand the underlying elements that influence the change in a time series.
Although this is a multivariate time series. A Granger test for causality may be executed iteratively on each of the variables to analyse for causality within the time series. From this series of granger tests, a heatmap is applied to visualise the significance of the results of the granger test for each variable onto another. From this heatmap several significant p-values can be seen showing that there is information present within the returns time series for one variable that explains the change in another.
Notably there is significance for almost all stock prices on each other. This can be interpreted as these stock prices having some inter-dependency on each other as a change in one has the significant implication of a change on many others and vice versa. This leads to the necessity in an investigation of causality for this returns time series as a multivariate time series.
Causality testing within a VAR model is the natural choice for a multivariate time series. To build the VAR model, the VARselect function is used to calculate the optimal lag order. All four information criteria select p=1, so a lag order of 1 is used for the VAR model.
For each variable in turn, the causality function tests the null hypothesis that it does not Granger-cause all the others, and the null hypothesis that there is no instantaneous causality between it and the others.
The Granger null hypothesis is rejected at the 5% level for META, MSFT, BTC, ETH and MONERO, and the null hypothesis of no instantaneous causality is rejected for all twelve series. This again indicates that information is present in the individual time series that may help to explain the change in the others.
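As an illustration of what a pairwise lag-1 Granger test does, the sketch below compares two nested regressions with an F test using base R only. It runs on simulated data, and the names x, y, restricted and unrestricted are hypothetical; it is a simplified stand-in for the idea behind the test, not the code used in this analysis.

```r
## Minimal sketch of a lag-1 Granger test on simulated data (base R only).
set.seed(1)
n <- 500
x <- rnorm(n)
y <- numeric(n)
## y depends on the previous value of x, so x should Granger-cause y
for (t in 2:n) y[t] <- 0.5 * x[t - 1] + rnorm(1)

y.now  <- y[2:n]
y.lag1 <- y[1:(n - 1)]
x.lag1 <- x[1:(n - 1)]

## restricted model: y explained by its own lag only
restricted <- lm(y.now ~ y.lag1)
## unrestricted model: the lag of x is added
unrestricted <- lm(y.now ~ y.lag1 + x.lag1)

## F test of the null hypothesis that the lag of x adds nothing
granger.f <- anova(restricted, unrestricted)
granger.p <- granger.f$`Pr(>F)`[2]
granger.p
```

A small p-value here means the lag of x improves the forecast of y beyond what the lag of y already provides, which is the sense of "causality" used in this section.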
## Granger test for causality for each ordered pair of series
## g.test[i, j] holds the p-value for "series i Granger-causes series j"
n.series <- ncol(returns)
g.test <- data.frame(matrix(nrow = n.series, ncol = n.series),
                     row.names = colnames(returns))
colnames(g.test) <- colnames(returns)
for (i in 1:n.series) {
  for (j in 1:n.series) {
    if (i != j) {
      ## grangertest(x, y) tests whether x Granger-causes y
      g.result <- grangertest(returns[, i], returns[, j], order = 1)
      g.test[i, j] <- g.result$`Pr(>F)`[2]
    }
  }
}
g.test
heatmaply(g.test)
## Causality based on VAR
VARselect(returns, lag.max = 10)
$selection
AIC(n) HQ(n) SC(n) FPE(n)
1 1 1 1
$criteria
1 2 3 4 5 6 7 8
AIC(n) -8.480611e+01 -8.470356e+01 -8.467208e+01 -8.461425e+01 -8.458478e+01 -8.453371e+01 -8.449701e+01 -8.442538e+01
HQ(n) -8.459726e+01 -8.430191e+01 -8.407764e+01 -8.382702e+01 -8.360476e+01 -8.336089e+01 -8.313141e+01 -8.286699e+01
SC(n) -8.424595e+01 -8.362632e+01 -8.307777e+01 -8.250286e+01 -8.195632e+01 -8.138817e+01 -8.083440e+01 -8.024570e+01
FPE(n) 1.476300e-37 1.635792e-37 1.688260e-37 1.789068e-37 1.843073e-37 1.940440e-37 2.014108e-37 2.165283e-37
9 10
AIC(n) -8.432799e+01 -8.424731e+01
HQ(n) -8.257681e+01 -8.230334e+01
SC(n) -7.963124e+01 -7.903348e+01
FPE(n) 2.389056e-37 2.592891e-37
## all four criteria select p = 1, so the AIC-based and BIC-based models coincide
model_aic = VAR(returns, p = 1)
model_bic = VAR(returns, p = 1)
## causality
causality(model_aic, cause = "TSLA")
$Granger
Granger causality H0: TSLA do not Granger-cause AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 0.92972, df1 = 11, df2 = 17652, p-value = 0.5101
$Instant
H0: No instantaneous causality between: TSLA and AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 265.15, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "AMZN")
$Granger
Granger causality H0: AMZN do not Granger-cause TSLA AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 1.1993, df1 = 11, df2 = 17652, p-value = 0.281
$Instant
H0: No instantaneous causality between: AMZN and TSLA AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 558.86, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "AAPL")
$Granger
Granger causality H0: AAPL do not Granger-cause TSLA AMZN META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 0.62763, df1 = 11, df2 = 17652, p-value = 0.8068
$Instant
H0: No instantaneous causality between: AAPL and TSLA AMZN META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 524.85, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "META")
$Granger
Granger causality H0: META do not Granger-cause TSLA AMZN AAPL GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 1.9929, df1 = 11, df2 = 17652, p-value = 0.02505
$Instant
H0: No instantaneous causality between: META and TSLA AMZN AAPL GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 519.96, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "GOOG")
$Granger
Granger causality H0: GOOG do not Granger-cause TSLA AMZN AAPL META NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 1.2902, df1 = 11, df2 = 17652, p-value = 0.2226
$Instant
H0: No instantaneous causality between: GOOG and TSLA AMZN AAPL META NFLX MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 599.28, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "NFLX")
$Granger
Granger causality H0: NFLX do not Granger-cause TSLA AMZN AAPL META GOOG MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 1.069, df1 = 11, df2 = 17652, p-value = 0.3821
$Instant
H0: No instantaneous causality between: NFLX and TSLA AMZN AAPL META GOOG MSFT BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 406.43, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "MSFT")
$Granger
Granger causality H0: MSFT do not Granger-cause TSLA AMZN AAPL META GOOG NFLX BTC DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 3.6224, df1 = 11, df2 = 17652, p-value = 3.852e-05
$Instant
H0: No instantaneous causality between: MSFT and TSLA AMZN AAPL META GOOG NFLX BTC DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 608.77, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "BTC")
$Granger
Granger causality H0: BTC do not Granger-cause TSLA AMZN AAPL META GOOG NFLX MSFT DOGE ETH LTC MONERO
data: VAR object model_aic
F-Test = 1.9362, df1 = 11, df2 = 17652, p-value = 0.03049
$Instant
H0: No instantaneous causality between: BTC and TSLA AMZN AAPL META GOOG NFLX MSFT DOGE ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 413.76, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "LTC")
$Granger
Granger causality H0: LTC do not Granger-cause TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH MONERO
data: VAR object model_aic
F-Test = 0.51135, df1 = 11, df2 = 17652, p-value = 0.8972
$Instant
H0: No instantaneous causality between: LTC and TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH MONERO
data: VAR object model_aic
Chi-squared = 393.18, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "ETH")
$Granger
Granger causality H0: ETH do not Granger-cause TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE LTC MONERO
data: VAR object model_aic
F-Test = 1.8178, df1 = 11, df2 = 17652, p-value = 0.04549
$Instant
H0: No instantaneous causality between: ETH and TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE LTC MONERO
data: VAR object model_aic
Chi-squared = 300.06, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "DOGE")
$Granger
Granger causality H0: DOGE do not Granger-cause TSLA AMZN AAPL META GOOG NFLX MSFT BTC ETH LTC MONERO
data: VAR object model_aic
F-Test = 0.9988, df1 = 11, df2 = 17652, p-value = 0.4444
$Instant
H0: No instantaneous causality between: DOGE and TSLA AMZN AAPL META GOOG NFLX MSFT BTC ETH LTC MONERO
data: VAR object model_aic
Chi-squared = 186.45, df = 11, p-value < 2.2e-16
causality(model_aic, cause = "MONERO")
$Granger
Granger causality H0: MONERO do not Granger-cause TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC
data: VAR object model_aic
F-Test = 34.015, df1 = 11, df2 = 17652, p-value < 2.2e-16
$Instant
H0: No instantaneous causality between: MONERO and TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC
data: VAR object model_aic
Chi-squared = 59.472, df = 11, p-value = 1.162e-08
Statistical Factor Modelling
Statistical factor modelling using the factanal function is applied to the data set to understand how many latent variables are sufficient to model the change in returns. A for loop is run over all valid numbers of factors, both for a model with no rotation and for a model with a varimax rotation, which maximises the interpretability of the loadings.
For both rotations, the largest model with a significant p-value is the one with 4 factors, and it is taken forward as the model of interest. Note that factanal tests the null hypothesis that the chosen number of factors is sufficient, so the p-value of 0.000953 indicates that some structure in the returns remains unexplained even with 4 factors.
Looking specifically at the 4 factor model with the varimax rotation:
Factor 1 has strong loadings on the traditional financial assets, with somewhat lower loadings for Tesla and Netflix, and may be read as a tech market factor.
Factor 2 has strong loadings on Bitcoin, Litecoin, Ethereum and Dogecoin, with a weaker loading for Monero. It has very weak loadings on the traditional financial assets, indicating a factor that represents cryptocurrencies.
Factor 3 has a high loading on Google with more marginal loadings on Meta and Microsoft. This is harder to interpret but again favours the traditional financial assets.
Factor 4 is also harder to interpret.
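The loop over candidate numbers of factors described above can be sketched as follows. This is a minimal illustration on simulated data with six hypothetical series (a1 to b3) driven by two hypothetical factors, not the loop used on the returns. Rotation is left out because it does not change the fit statistic.

```r
## Sketch: loop over candidate numbers of factors with factanal (simulated data).
set.seed(42)
n <- 1000
f1 <- rnorm(n)
f2 <- rnorm(n)
## six hypothetical series: three driven by f1, three by f2
sim.returns <- cbind(
  a1 = 0.8 * f1 + rnorm(n, sd = 0.6),
  a2 = 0.7 * f1 + rnorm(n, sd = 0.7),
  a3 = 0.6 * f1 + rnorm(n, sd = 0.8),
  b1 = 0.8 * f2 + rnorm(n, sd = 0.6),
  b2 = 0.7 * f2 + rnorm(n, sd = 0.7),
  b3 = 0.6 * f2 + rnorm(n, sd = 0.8)
)

max.factors <- 2
fit.pvalues <- numeric(max.factors)
for (k in 1:max.factors) {
  fit <- factanal(sim.returns, factors = k, rotation = "none")
  ## p-value of the null hypothesis that k factors are sufficient
  fit.pvalues[k] <- fit$PVAL
}
fit.pvalues
```

A small p-value for a given k rejects the hypothesis that k factors are sufficient, so in this sketch the one factor model is expected to be rejected.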
print(factanal(returns, factors=4, rotation = "varimax"), cutoff=0.1)
Call:
factanal(x = returns, factors = 4, rotation = "varimax")
Uniquenesses:
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
0.767 0.222 0.213 0.429 0.005 0.567 0.262 0.342 0.808 0.607 0.412 0.974
Loadings:
Factor1 Factor2 Factor3 Factor4
TSLA 0.472
AMZN 0.859 -0.187
AAPL 0.783 0.414
META 0.741 0.143
GOOG 0.817 0.568
NFLX 0.644 -0.128
MSFT 0.833 0.144 0.143
BTC 0.803
DOGE 0.435
ETH 0.621
LTC 0.756 0.108
MONERO 0.141
Factor1 Factor2 Factor3 Factor4
SS loadings 3.914 1.829 0.374 0.276
Proportion Var 0.326 0.152 0.031 0.023
Cumulative Var 0.326 0.479 0.510 0.533
Test of the hypothesis that 4 factors are sufficient.
The chi square statistic is 51.34 on 24 degrees of freedom.
The p-value is 0.000953
Next, Bartlett’s test of sphericity is used to determine whether factor analysis is appropriate at all. The test checks whether the correlation matrix of the returns differs significantly from the identity matrix; if it does not, the variables are essentially uncorrelated and there is nothing for latent factors to explain. As the p-value is < 0.05, the correlation matrix is significantly different from the identity matrix and factor analysis can proceed.
## bartlett sphericity testing
cortest.bartlett(returns)
R was not square, finding R from data
$chisq
[1] 7387.743
$p.value
[1] 0
$df
[1] 66
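For reference, the statistic behind this test can be sketched in base R, assuming the usual form based on the log determinant of the correlation matrix. The function name bartlett.sphericity and the simulated data are hypothetical and only serve the illustration.

```r
## Sketch of Bartlett's test of sphericity in base R (simulated data).
set.seed(7)
n <- 300
common <- rnorm(n)
## four hypothetical series sharing a common component
sim.data <- sapply(1:4, function(i) 0.6 * common + rnorm(n))

bartlett.sphericity <- function(x) {
  n <- nrow(x)
  p <- ncol(x)
  R <- cor(x)
  ## chi-square statistic based on the log determinant of R
  chisq <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
  ## one degree of freedom per distinct off-diagonal correlation
  df <- p * (p - 1) / 2
  p.value <- pchisq(chisq, df, lower.tail = FALSE)
  list(chisq = chisq, p.value = p.value, df = df)
}

bart <- bartlett.sphericity(sim.data)
bart
```

With 12 series the degrees of freedom are 12 * 11 / 2 = 66, which matches the output above.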
The Kaiser-Meyer-Olkin measure of sampling adequacy (MSA) is used to measure the factorability of the data set. With a reasonable cutoff defined as an MSA of 0.6, the overall result of 0.88 is a very good score and shows that this data set may be explained well by latent variables expressed as factors.
## Kaiser-Meyer-Olkin factor adequacy
KMO(returns)
Kaiser-Meyer-Olkin factor adequacy
Call: KMO(r = returns)
Overall MSA = 0.88
MSA for each item =
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
0.95 0.92 0.92 0.92 0.88 0.93 0.87 0.72 0.82 0.81 0.73 0.67
## 0.88 is a very good score
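The overall MSA compares the squared correlations with the squared partial correlations. A minimal base R sketch, on simulated data and with the hypothetical function name kmo.overall, is given below. It only reproduces the overall measure, not the per-item values.

```r
## Sketch of the overall Kaiser-Meyer-Olkin measure in base R (simulated data).
set.seed(5)
n <- 400
common <- rnorm(n)
## five hypothetical series sharing one common component
sim.data <- sapply(1:5, function(i) 0.7 * common + rnorm(n, sd = 0.7))

kmo.overall <- function(x) {
  R <- cor(x)
  ## partial correlations from the rescaled inverse of R
  Q <- -cov2cor(solve(R))
  diag(R) <- 0
  diag(Q) <- 0
  ## squared correlations relative to squared correlations plus squared partials
  sum(R^2) / (sum(R^2) + sum(Q^2))
}

msa <- kmo.overall(sim.data)
msa
```

When the partial correlations are small relative to the raw correlations, the measure approaches 1, which is the situation in which common factors can account for most of the shared variation.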
To decide on the number of factors to extract, a scree plot and a parallel analysis are examined to see at what point factors no longer capture sufficient meaningful variance.
The parallel analysis recommends 4 factors, which is taken as the maximum number to extract. Looking at the scree plot, however, 3 factors appear sufficient to capture the meaningful variance.
## scree plot
scree(returns)
## use a max of 3 factors
## parallel analysis
fa.parallel(returns)
Parallel analysis suggests that the number of factors = 4 and the number of components = 2
## parallel analysis suggests 4 factors (and 2 components)
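The idea behind parallel analysis can be sketched in base R by comparing the eigenvalues of the observed correlation matrix with those of uncorrelated random data of the same shape. This is a simplified, components-only illustration on simulated data. The 95% threshold, the number of simulations and all variable names are assumptions of the sketch, and fa.parallel itself does more than this.

```r
## Simplified sketch of parallel analysis for components (simulated data).
set.seed(11)
n <- 500
p <- 6
f <- rnorm(n)
## three hypothetical series sharing a factor, three pure noise series
sim.data <- cbind(sapply(1:3, function(i) 0.8 * f + rnorm(n, sd = 0.6)),
                  matrix(rnorm(n * 3), n, 3))

observed <- eigen(cor(sim.data))$values

## eigenvalues from uncorrelated random data of the same shape
n.sims <- 50
random <- replicate(n.sims, eigen(cor(matrix(rnorm(n * p), n, p)))$values)
threshold <- apply(random, 1, quantile, probs = 0.95)

## keep the leading components whose eigenvalue exceeds the random threshold
n.keep <- sum(cumprod(observed > threshold))
n.keep
```

The eigenvalues of a correlation matrix sum to the number of variables, which is why dividing them by that number gives a proportion of variance.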
As the recommended number of factors to extract is 4, a factor model with 4 factors is constructed and analysed.
In this factor model:
Factor 1 has large loadings on Google and Microsoft, with slightly lower but still significant loadings for Apple and Meta. This may indicate a tech conglomerate factor.
Factor 2 has large loadings on Bitcoin, Ethereum, Litecoin and Dogecoin, representing the more prominent cryptocurrencies.
Factor 3 has large loadings on Amazon and Netflix with a weaker loading on Tesla. This is more difficult to interpret but may be related to the prominence of Amazon and Netflix as online streaming services.
Factor 4 has no significant loadings, and its proportion of variance explained of 0.02 leads to the consideration of a 3 factor model.
The factor diagram again demonstrates the difference between the loadings for stocks and cryptocurrencies, reflecting the difference in the underlying causes of change in their returns as financial instruments.
## factor model with 4 factors
factor4.model <- fa(returns, nfactors = 4, fm = "ols", max.iter = 100, rotate = "oblimin")
Loading required namespace: GPArotation
factor4.model
Factor Analysis using method = ols
Call: fa(r = returns, nfactors = 4, rotate = "oblimin", max.iter = 100,
fm = "ols")
Standardized loadings (pattern matrix) based upon correlation matrix
[,1] [,2] [,3] [,4]
SS loadings 2.81 1.85 1.32 0.22
Proportion Var 0.23 0.15 0.11 0.02
Cumulative Var 0.23 0.39 0.50 0.52
Proportion Explained 0.45 0.30 0.21 0.04
Cumulative Proportion 0.45 0.75 0.96 1.00
With factor correlations of
[,1] [,2] [,3] [,4]
[1,] 1.00 0.17 0.85 -0.13
[2,] 0.17 1.00 0.11 0.01
[3,] 0.85 0.11 1.00 -0.08
[4,] -0.13 0.01 -0.08 1.00
Mean item complexity = 1.3
Test of the hypothesis that 4 factors are sufficient.
df null model = 66 with the objective function = 4.99 with Chi Square = 7387.74
df of the model are 24 and the objective function was 0.07
The root mean square of the residuals (RMSR) is 0.01
The df corrected root mean square of the residuals is 0.02
The harmonic n.obs is 1485 with the empirical chi square 32.86 with prob < 0.11
The total n.obs was 1485 with Likelihood Chi Square = 98.98 with prob < 4.5e-11
Tucker Lewis Index of factoring reliability = 0.972
RMSEA index = 0.046 and the 90 % confidence intervals are 0.037 0.055
BIC = -76.29
Fit based upon off diagonal values = 1
Measures of factor score adequacy
[,1] [,2] [,3] [,4]
Correlation of (regression) scores with factors 0.96 0.90 0.91 0.52
Multiple R square of scores with factors 0.92 0.81 0.83 0.27
Minimum correlation of possible factor scores 0.84 0.62 0.65 -0.46
fa.diagram(factor4.model)
factor4.model$communality
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC
0.27778817 0.70842401 0.62512360 0.58071735 0.78176187 0.49531934 0.76929788 0.64831041 0.21365024 0.41167593 0.60809810
MONERO
0.07679296
factor4.model$e.values
[1] 4.4138594 2.2449726 0.9936580 0.7793188 0.7468769 0.6057978 0.5504472 0.4292004 0.3706688 0.3407758 0.3202190 0.2042053
100*factor4.model$e.values/length(factor4.model$e.values)
[1] 36.782162 18.708105 8.280483 6.494323 6.223974 5.048315 4.587060 3.576670 3.088906 2.839798 2.668492 1.701711
print(factor4.model$Structure, cutoff=0, digits=3)
Loadings:
[,1] [,2] [,3] [,4]
TSLA 0.448 0.105 0.468 -0.265
AMZN 0.782 0.112 0.827 -0.045
AAPL 0.770 0.152 0.665 -0.269
META 0.756 0.099 0.691 -0.073
GOOG 0.880 0.147 0.737 -0.024
NFLX 0.584 0.080 0.703 -0.064
MSFT 0.876 0.157 0.751 -0.150
BTC 0.135 0.805 0.091 0.000
DOGE 0.048 0.429 0.046 0.174
ETH 0.126 0.623 0.101 0.152
LTC 0.126 0.775 0.076 -0.075
MONERO -0.006 0.134 0.023 0.238
[,1] [,2] [,3] [,4]
SS loadings 3.910 1.949 3.454 0.293
Proportion Var 0.326 0.162 0.288 0.024
Cumulative Var 0.326 0.488 0.776 0.801
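Looking now at a 3 factor model, there is no further insight or information to be gained. It reflects a similar structure and explanation to the 4 factor model. The fitted models report a cumulative proportion of variance of 0.52 for 4 factors against 0.50 for 3 factors. The larger gap in the structure matrix summaries (0.801 against 0.512) should be read with caution, because structure loadings of correlated factors count shared variance more than once, and factors 1 and 3 of the 4 factor model correlate at 0.85.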
## again with 3 factors
factor3.model <- fa(returns, nfactors = 3, fm = "ols", max.iter = 100, rotate = "oblimin")
factor3.model
Factor Analysis using method = ols
Call: fa(r = returns, nfactors = 3, rotate = "oblimin", max.iter = 100,
fm = "ols")
Standardized loadings (pattern matrix) based upon correlation matrix
[,1] [,2] [,3]
SS loadings 3.94 1.85 0.20
Proportion Var 0.33 0.15 0.02
Cumulative Var 0.33 0.48 0.50
Proportion Explained 0.66 0.31 0.03
Cumulative Proportion 0.66 0.97 1.00
With factor correlations of
[,1] [,2] [,3]
[1,] 1.00 0.16 -0.02
[2,] 0.16 1.00 -0.02
[3,] -0.02 -0.02 1.00
Mean item complexity = 1.1
Test of the hypothesis that 3 factors are sufficient.
df null model = 66 with the objective function = 4.99 with Chi Square = 7387.74
df of the model are 33 and the objective function was 0.12
The root mean square of the residuals (RMSR) is 0.02
The df corrected root mean square of the residuals is 0.03
The harmonic n.obs is 1485 with the empirical chi square 61.94 with prob < 0.0017
The total n.obs was 1485 with Likelihood Chi Square = 180.4 with prob < 3.1e-22
Tucker Lewis Index of factoring reliability = 0.96
RMSEA index = 0.055 and the 90 % confidence intervals are 0.047 0.063
BIC = -60.61
Fit based upon off diagonal values = 1
Measures of factor score adequacy
[,1] [,2] [,3]
Correlation of (regression) scores with factors 0.96 0.90 0.51
Multiple R square of scores with factors 0.92 0.81 0.26
Minimum correlation of possible factor scores 0.84 0.62 -0.47
fa.diagram(factor3.model)
factor3.model$communality
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
0.2378519 0.6950366 0.6248142 0.5818147 0.7228166 0.4105847 0.7542626 0.6473513 0.2125866 0.4111961 0.6086903 0.0822757
factor3.model$e.values
[1] 4.4138594 2.2449726 0.9936580 0.7793188 0.7468769 0.6057978 0.5504472 0.4292004 0.3706688 0.3407758 0.3202190 0.2042053
100*factor3.model$e.values/length(factor3.model$e.values)
[1] 36.782162 18.708105 8.280483 6.494323 6.223974 5.048315 4.587060 3.576670 3.088906 2.839798 2.668492 1.701711
print(factor3.model$Structure, cutoff=0, digits=3)
Loadings:
[,1] [,2] [,3]
TSLA 0.475 0.098 -0.120
AMZN 0.821 0.109 0.125
AAPL 0.771 0.152 -0.189
META 0.762 0.100 0.008
GOOG 0.850 0.152 -0.002
NFLX 0.631 0.076 0.098
MSFT 0.866 0.159 -0.081
BTC 0.132 0.805 -0.026
DOGE 0.046 0.430 0.156
ETH 0.122 0.624 0.132
LTC 0.122 0.775 -0.105
MONERO -0.003 0.134 0.249
[,1] [,2] [,3]
SS loadings 3.994 1.949 0.198
Proportion Var 0.333 0.162 0.016
Cumulative Var 0.333 0.495 0.512
Economic Factor Modelling
The first Economic factor model is undertaken with conventional economic factors:
Interest rate - In this case the interest rate is represented by the United States Market Yield on U.S. Treasury Securities at 10-Year Constant Maturity. This is the return on a risk free investment over a 10 year period, and so serves as a benchmark that the returns of a portfolio should beat.
Core Consumer Price Index - This is a measure of how core consumer prices shift, excluding costs of staples such as energy and food which are typically the more volatile components of the broader consumer price index (Davidson & Jones, 2023). As such CCPI is a widely used measure of inflation.
The data on these 2 Macroeconomic factors is gathered from https://www.kaggle.com/datasets/calven22/usa-key-macroeconomic-indicators
The data set has monthly data on these factors ranging from 1981-01-01 to 2021-10-01.
Next, a time series of the monthly change in these factors is constructed. Log changes are used to reduce the skewness and the difference in scale between the two variables.
The log of monthly returns for the selected prices is also calculated for consistency.
This produces the log of the change per month for 65 months, which should be sufficient to fit an economic factor model on a monthly scale.
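As a small worked example of the log change used here, the sketch below computes it for three hypothetical prices in base R and converts it back to a simple percentage change.

```r
## Worked example of log changes on hypothetical prices (base R only).
prices.example <- c(100, 110, 99)

## log change: log(p_t / p_(t-1)), computed as the difference of log prices
log.change <- diff(log(prices.example))

## the equivalent simple change, recovered from the log change
simple.change <- exp(log.change) - 1

log.change
simple.change
```

Log changes add up across periods, so the sum of the two log changes equals the log of the overall price ratio 99 / 100.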
macros <- read.csv("macros.csv")
macros$Date <- as.POSIXct(macros$Date, format = "%d/%m/%Y")
macros.xts <- xts(macros[,c(2,3)], order.by = macros$Date)
macros.change <- Return.calculate(macros.xts, method = "log")
index(macros.change) <- as.yearmon(index(macros.change))
head(macros.xts)
ir ccpi
1981-01-01 12.56857 85.4
1981-02-01 13.19444 85.9
1981-03-01 13.11591 86.4
1981-04-01 13.67952 87.0
1981-05-01 14.09950 87.8
1981-06-01 13.47227 88.6
tail(macros.xts)
ir ccpi
2021-05-01 1.621000 275.718
2021-06-01 1.519091 278.140
2021-07-01 1.318571 279.054
2021-08-01 1.283182 279.338
2021-09-01 1.374762 280.017
2021-10-01 1.582500 281.695
## calculating log of monthly returns for selected prices
prices.full <- na.omit(prices)
prices.monthly <- apply.monthly(prices.full, first)
## Calculate monthly log returns
returns.monthly <- Return.calculate(prices.monthly, method = "log")
index(returns.monthly) <- as.yearmon(index(returns.monthly))
## combining the factors and returns together
everything <- cbind(macros.change, returns.monthly)
## removing the incomplete rows
everything <- na.omit(everything)
head(everything)
ir ccpi TSLA AMZN AAPL META GOOG NFLX MSFT
Sep 2015 -0.048494350 0.001855943 -0.01046346 -0.053827683 -0.10562041 -0.07634092 -0.05838350 -0.15097221 -0.123769696
Nov 2015 0.089212529 0.001938489 -0.11514450 0.187884879 0.10062215 0.12742428 0.16522019 0.01554189 0.176851962
Dec 2015 -0.009068479 0.001252077 0.10386724 0.077612141 -0.03220137 0.03621557 0.06174729 0.15247707 0.036515204
Jan 2016 -0.072797596 0.001667001 -0.05985287 -0.063955485 -0.10778755 -0.04682236 -0.03340542 -0.13115273 -0.007635049
Feb 2016 -0.158562636 0.002263159 -0.12610960 -0.102714444 -0.08847077 0.11858703 0.01360272 -0.15586492 -0.001643689
Mar 2016 0.059763287 0.001640135 -0.05527244 0.007332014 0.04163882 -0.04687174 -0.04513926 0.04377233 -0.039710637
BTC DOGE ETH LTC MONERO
Sep 2015 -0.147851628 -0.23007721 0.64597506 -0.33669157 -0.254236329
Nov 2015 0.419027453 0.18521090 0.36048867 0.32744014 0.077002248
Dec 2015 0.003590034 -0.14629660 -0.12349648 -0.19270440 -0.182957434
Jan 2016 0.177956487 0.12113631 0.08717136 0.01745711 0.303200293
Feb 2016 -0.149219334 0.63194327 0.84034550 -0.12741185 0.003528939
Mar 2016 0.153900175 -0.08297657 1.24143096 0.10939270 0.501889893
tail(everything)
ir ccpi TSLA AMZN AAPL META GOOG NFLX MSFT
Feb 2021 0.151504254 0.001014206 0.14044629 0.047868780 0.03589834 -0.02610561 0.09546055 0.03047603 0.09610746
Mar 2021 0.247334656 0.003379423 -0.15610742 -0.060656346 -0.04849573 0.01100745 0.09052947 0.02129153 -0.01137255
Apr 2021 -0.008599562 0.007345988 -0.08218047 0.004712104 -0.03820394 0.11991566 0.02666025 -0.02058681 0.02257603
May 2021 -0.064931173 0.008745980 0.03438504 0.068905495 0.07470008 0.07704531 0.11370036 -0.05783038 0.03849039
Jun 2021 -0.141563170 0.003280727 -0.09328279 -0.050831961 -0.06434736 0.02010170 0.01435887 -0.01989770 -0.01786694
Jul 2021 -0.027206111 0.001017207 0.08303920 0.064463781 0.09941275 0.07394522 0.03936616 0.06676763 0.09332398
BTC DOGE ETH LTC MONERO
Feb 2021 0.04779676 1.2727323 0.2746653 -0.1597435 0.1021942
Mar 2021 0.39196592 0.3721640 0.1335889 0.2837068 0.3770633
Apr 2021 0.17453949 0.2029844 0.2340216 0.1525269 0.1659878
May 2021 -0.03260100 1.9637366 0.5511563 0.3673001 0.3767658
Jun 2021 -0.44419309 -0.1769362 -0.2645561 -0.4765909 -0.2568216
Jul 2021 -0.08867000 -0.4142922 -0.2199255 -0.2851930 -0.3304003
The first thing to evaluate from this Economic factor model is its R squared value for each of the variables, as this represents the proportion of variance explained by the model. Dogecoin and Netflix have the highest R squared values of the set, but these are still very low, with the model explaining only ~6-7% of the variance in their monthly returns.
Amazon and Microsoft have extremely low R squared values (below 0.3%), showing that this model has negligible explanatory power for these prices.
Looking at the betas for interest rate and CCPI on each of the returns:
An unexpected increase in the interest rate is associated with an increase in returns for both Dogecoin and Monero, and with a decrease in returns for Meta and Tesla. This could prove to be a useful insight for hedging against an expected period of volatility in interest rates. However, it must be taken with a grain of salt: none of the interest rate betas is statistically significant at the 5% level, and the R squared values for these stocks and cryptocurrencies are not sufficient to inform investment decisions.
This is an interesting finding, as cryptocurrencies have been used as a shield against unexpected changes in inflation rates as well as CCPI (Godwin, 2023), though the lack of response shown in this model for Ethereum and Bitcoin may be attributable to the low R squared values.
Interestingly, the CCPI beta is large and positive for all cryptocurrencies, although only the Dogecoin beta is statistically significant at the 5% level (p = 0.047). This contrasts with the traditional financial instruments, whose CCPI betas are smaller in magnitude and mixed in sign. This goes against the common understanding of CCPI as an important indicator of the health of an economy, which again shows the inadequacy of this model.
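## fit an autoregressive model to the two factors; its residuals are the unexpected changes
arFit1 <- ar(cbind(everything$ir, everything$ccpi))
## drop the leading rows with no residuals (the indexing assumes a fitted order of 2)
res1 <- arFit1$resid[3:65,]
## regress all 12 return series on the two residual series
lmfit1 <- lm(everything[3:65,3:14]~res1[,1]+res1[,2])
slmfit1 <- summary(lmfit1)
slmfit1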
Response TSLA :
Call:
lm(formula = TSLA ~ res1[, 1] + res1[, 2])
Residuals:
TSLA
Min -0.44561
1Q -0.11939
Median -0.01154
3Q 0.09284
Max 0.53207
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.04152 0.02284 1.818 0.074 .
res1[, 1] -0.11418 0.22194 -0.514 0.609
res1[, 2] -19.85297 21.66688 -0.916 0.363
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1812 on 60 degrees of freedom
Multiple R-squared: 0.01754, Adjusted R-squared: -0.01521
F-statistic: 0.5355 on 2 and 60 DF, p-value: 0.5882
Response AMZN :
Call:
lm(formula = AMZN ~ res1[, 1] + res1[, 2])
Residuals:
AMZN
Min -0.21438
1Q -0.04642
Median 0.01464
3Q 0.04177
Max 0.15735
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.027986 0.009623 2.908 0.00509 **
res1[, 1] -0.014255 0.093522 -0.152 0.87936
res1[, 2] -2.280271 9.129843 -0.250 0.80363
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.07637 on 60 degrees of freedom
Multiple R-squared: 0.00138, Adjusted R-squared: -0.03191
F-statistic: 0.04146 on 2 and 60 DF, p-value: 0.9594
Response AAPL :
Call:
lm(formula = AAPL ~ res1[, 1] + res1[, 2])
Residuals:
AAPL
Min -0.250962
1Q -0.051945
Median 0.008009
3Q 0.065696
Max 0.174827
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.02513 0.01175 2.139 0.0366 *
res1[, 1] 0.07196 0.11420 0.630 0.5310
res1[, 2] 0.61932 11.14896 0.056 0.9559
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.09326 on 60 degrees of freedom
Multiple R-squared: 0.006591, Adjusted R-squared: -0.02652
F-statistic: 0.199 on 2 and 60 DF, p-value: 0.82
Response META :
Call:
lm(formula = META ~ res1[, 1] + res1[, 2])
Residuals:
META
Min -0.221776
1Q -0.057986
Median 0.003369
3Q 0.058250
Max 0.212182
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.02255 0.01100 2.050 0.0447 *
res1[, 1] -0.09937 0.10688 -0.930 0.3562
res1[, 2] 7.28571 10.43350 0.698 0.4877
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.08727 on 60 degrees of freedom
Multiple R-squared: 0.02283, Adjusted R-squared: -0.009743
F-statistic: 0.7009 on 2 and 60 DF, p-value: 0.5002
Response GOOG :
Call:
lm(formula = GOOG ~ res1[, 1] + res1[, 2])
Residuals:
GOOG
Min -0.25969
1Q -0.03966
Median 0.01218
3Q 0.05285
Max 0.14836
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.020909 0.008784 2.380 0.0205 *
res1[, 1] 0.063964 0.085369 0.749 0.4566
res1[, 2] 2.232128 8.333993 0.268 0.7897
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.06971 on 60 degrees of freedom
Multiple R-squared: 0.01021, Adjusted R-squared: -0.02278
F-statistic: 0.3096 on 2 and 60 DF, p-value: 0.7349
Response NFLX :
Call:
lm(formula = NFLX ~ res1[, 1] + res1[, 2])
Residuals:
NFLX
Min -0.21554
1Q -0.06837
Median 0.01889
3Q 0.06010
Max 0.22582
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.02526 0.01243 2.032 0.0466 *
res1[, 1] 0.06022 0.12078 0.499 0.6199
res1[, 2] -23.37695 11.79114 -1.983 0.0520 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.09863 on 60 degrees of freedom
Multiple R-squared: 0.06626, Adjusted R-squared: 0.03514
F-statistic: 2.129 on 2 and 60 DF, p-value: 0.1279
Response MSFT :
Call:
lm(formula = MSFT ~ res1[, 1] + res1[, 2])
Residuals:
MSFT
Min -0.149720
1Q -0.033762
Median 0.009747
3Q 0.033797
Max 0.113777
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.026150 0.007159 3.653 0.000546 ***
res1[, 1] -0.028637 0.069572 -0.412 0.682085
res1[, 2] 0.101883 6.791778 0.015 0.988081
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.05681 on 60 degrees of freedom
Multiple R-squared: 0.002831, Adjusted R-squared: -0.03041
F-statistic: 0.08517 on 2 and 60 DF, p-value: 0.9185
Response BTC :
Call:
lm(formula = BTC ~ res1[, 1] + res1[, 2])
Residuals:
BTC
Min -0.5637261
1Q -0.1387570
Median -0.0008323
3Q 0.1597228
Max 0.4907717
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.08055 0.03056 2.636 0.0107 *
res1[, 1] 0.13696 0.29704 0.461 0.6464
res1[, 2] 25.46621 28.99766 0.878 0.3833
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.2426 on 60 degrees of freedom
Multiple R-squared: 0.01567, Adjusted R-squared: -0.01714
F-statistic: 0.4777 on 2 and 60 DF, p-value: 0.6226
Response DOGE :
Call:
lm(formula = DOGE ~ res1[, 1] + res1[, 2])
Residuals:
DOGE
Min -0.91114
1Q -0.26776
Median -0.07433
3Q 0.15503
Max 1.41041
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.13634 0.06082 2.242 0.0287 *
res1[, 1] 0.49162 0.59113 0.832 0.4089
res1[, 2] 117.15420 57.70801 2.030 0.0468 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4827 on 60 degrees of freedom
Multiple R-squared: 0.07257, Adjusted R-squared: 0.04166
F-statistic: 2.348 on 2 and 60 DF, p-value: 0.1043
Response ETH :
Call:
lm(formula = ETH ~ res1[, 1] + res1[, 2])
Residuals:
ETH
Min -0.92742
1Q -0.29318
Median -0.03419
3Q 0.27172
Max 1.12560
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.13298 0.05039 2.639 0.0106 *
res1[, 1] -0.10184 0.48974 -0.208 0.8360
res1[, 2] 39.68206 47.80940 0.830 0.4098
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3999 on 60 degrees of freedom
Multiple R-squared: 0.01228, Adjusted R-squared: -0.02065
F-statistic: 0.3729 on 2 and 60 DF, p-value: 0.6903
Response LTC :
Call:
lm(formula = LTC ~ res1[, 1] + res1[, 2])
Residuals:
LTC
Min -0.61483
1Q -0.20660
Median -0.03133
3Q 0.17395
Max 0.86763
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.07405 0.04261 1.738 0.0874 .
res1[, 1] -0.02055 0.41412 -0.050 0.9606
res1[, 2] 58.84056 40.42788 1.455 0.1508
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.3382 on 60 degrees of freedom
Multiple R-squared: 0.03427, Adjusted R-squared: 0.002076
F-statistic: 1.064 on 2 and 60 DF, p-value: 0.3513
Response MONERO :
Call:
lm(formula = MONERO ~ res1[, 1] + res1[, 2])
Residuals:
MONERO
Min -0.70392
1Q -0.21173
Median -0.04903
3Q 0.17870
Max 1.82412
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.11126 0.05417 2.054 0.0444 *
res1[, 1] 0.52762 0.52650 1.002 0.3203
res1[, 2] 74.26762 51.39819 1.445 0.1537
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.4299 on 60 degrees of freedom
Multiple R-squared: 0.04746, Adjusted R-squared: 0.01571
F-statistic: 1.495 on 2 and 60 DF, p-value: 0.2325
## R squared of each of the 12 regressions
rsq1 = rep(0,12)
for (i in 1:12){rsq1[i]= slmfit1[[i]]$r.squared}
## betas of each return series on the interest rate and CCPI residuals
beta_IR = coef(lmfit1)[2,]
beta_CCPI = coef(lmfit1)[3,]
par(mfrow=c(1,3)) # building three graphs in a row
barplot(rsq1,horiz=TRUE,names=names(beta_IR),main="R squared")
barplot(beta_IR,horiz=TRUE,main="beta IR")
barplot(beta_CCPI,horiz=TRUE,main="beta CCPI")
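The construction used for this model, regressing returns on the residuals of an autoregressive fit to the factors, can be sketched on simulated data as follows. All names (fac1, fac2, shocks, r1, r2) and the fixed order of 2 are assumptions of the sketch, chosen to mirror the indexing in the code above rather than to reproduce the analysis.

```r
## Sketch: returns regressed on unexpected factor changes (simulated data).
set.seed(3)
n <- 200
## two hypothetical factor series with some persistence
fac1 <- as.numeric(arima.sim(list(ar = 0.5), n = n))
fac2 <- as.numeric(arima.sim(list(ar = 0.3), n = n))
sim.factors <- cbind(fac1 = fac1, fac2 = fac2)

## AR fit of fixed order 2; the residuals are the unexpected factor changes
ar.fit <- ar(sim.factors, aic = FALSE, order.max = 2)
shocks <- ar.fit$resid
## the first rows have no residuals because of the lags
keep <- complete.cases(shocks)
shock1 <- shocks[keep, 1]
shock2 <- shocks[keep, 2]

## two hypothetical return series with known sensitivities to the shocks
sim.returns <- cbind(
  r1 = 0.5 * shock1 + rnorm(sum(keep), sd = 0.5),
  r2 = -0.3 * shock2 + rnorm(sum(keep), sd = 0.5)
)

## one regression per return series, fitted in a single call
fit <- lm(sim.returns ~ shock1 + shock2)
betas <- coef(fit)
rsq <- sapply(summary(fit), function(s) s$r.squared)
betas
rsq
```

Using residuals rather than the raw factor changes means the betas measure the response to the part of each factor that could not be predicted from its own past.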
The first Economic Factor model is taken on a monthly time scale and as such incurs a great amount of noise that is unaccounted for.
A second Economic Factor model is constructed with slightly more unconventional economic factors but on a daily time scale so as not to lose the information present in the daily time series for returns.
Price of energy per Megawatt hour - As the cost of cryptocurrency mining is inherently linked to the cost of mining cryptocurrency, the price of energy may be a considerable factor in the price. Also as many of the stocks selected have high computing costs, it is of interest to investigate whether energy costs have influence on their stock pricings.
Weighted exchange rate of $USD against the world - A measure of the equivalent buying power of the US dollar relative to the world. This factor is chosen to see how stocks on the New York stock exchange react to shock changes in the weighted value of the US dollar. It’s also selected on the basis of investigating whether the US dollar has any underlying cause for the valuation of cryptocurrencies.
Both factors are also observed at a daily frequency. This is a further reason for selecting them: the frequency of observations is not substantially reduced, which would introduce greater noise and uncertainty into the model.
First, the data must be loaded and cleaned, and the daily change of each factor calculated. The log of the changes is used to make the data closer to normal and to improve linearity between variables.
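The log change described above can be illustrated with a minimal base R sketch. The price levels here are hypothetical values chosen for illustration, not taken from the project data; the result should correspond to what `Return.calculate(..., method = "log")` returns after its leading `NA`.

```r
# Hypothetical daily factor levels (illustrative values only)
level <- c(50.0, 52.5, 51.0, 51.0, 54.06)

# Log change between consecutive observations: log(x_t / x_{t-1})
log_change <- diff(log(level))

# Simple (arithmetic) change, for comparison
simple_change <- diff(level) / head(level, -1)

# Log changes are additive over time: their sum equals the log of the
# total change from the first to the last observation
total_log_change <- sum(log_change)

print(round(cbind(simple_change, log_change), 4))
print(total_log_change)
```

For small daily moves the two measures are close, but only the log changes can be summed across days to give the change over the whole period.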
##------------------UNCONVENTIONAL ECONOMIC FACTOR DATA CLEANING AND PREPARATION------------
## Cost of energy prices gathered from: https://www.kaggle.com/datasets/nicholasjhana/energy-consumption-generation-prices-and-weather
## loading energy price data
energy.prices <- read.csv("energyprices.csv")
## specifying dates
energy.prices$Date <- as.POSIXct(energy.prices$Date, format = "%d/%m/%Y")
## energy prices as xts
energy <- xts(energy.prices$Energy.Price, order.by = energy.prices$Date)
## Exchange rate data gathered from: https://www.bis.org/statistics/xrusd.htm
## loading exchange rates data
rates.df <- read.csv("Exchange.Rates.csv")
##taking only rate of USD against the world, removing qualitative rows, naming columns
ex.rates <- cbind(rates.df$Reference.area, rates.df$XW.World)
ex.rates <- as.data.frame(ex.rates[-c(1:7),])
colnames(ex.rates) <- c("Date", "Rate")
## specifying dates
ex.rates$Date <- as.POSIXct(ex.rates$Date, format = "%d/%m/%Y")
## creating time series
ex.rates.xts <- xts(ex.rates[,2], order.by = ex.rates$Date)
## combining the two time series of economic factors
factors <- cbind(energy, ex.rates.xts)
##omitting na's
factors <- na.omit(factors)
## calculating log changes of both factors
factors$energy <- Return.calculate(factors$energy, method = "log")
factors$ex.rates.xts <- Return.calculate(factors$ex.rates.xts, method = "log")
## statistical description of factors
stat.desc(factors[-1,])
The R squared values for this model make it clear that it explains very little of the variance in price changes. The highest R squared, for Monero, is only 0.018, so the model accounts for just 1.8% of the variance in Monero’s price changes.
Despite the very poor R squared values, one interesting finding is the difference in how Ethereum and Monero respond to unexpected changes in the strength of the US dollar: the estimated exchange rate beta is negative for Ethereum and positive for Monero, although neither coefficient is statistically significant. This is consistent with the attraction of Monero as an independent, deregulated form of currency that acts as a safe haven from the unexpected volatility of fiat currencies (Godwin, 2023).
model.df <- cbind(factors, returns)
model.df <- na.omit(model.df)
## 340 valid rows from 2015-10-26 to 2018-12-28
## auto-regressive model of two factors
arFit2 <- ar(cbind(model.df$energy, model.df$ex.rates.xts))
##extracting residuals
res2 <- arFit2$resid[5:340,]
## fitting a regression of the returns of the stocks and crypto against the residuals of the economic factors
lmfit2 <- lm(model.df[5:340,3:14]~res2[,1]+res2[,2])
slmfit2 <- summary(lmfit2)
slmfit2
Response TSLA :
Call:
lm(formula = TSLA ~ res2[, 1] + res2[, 2])
Residuals:
TSLA
Min -0.090578
1Q -0.013559
Median -0.001142
3Q 0.014780
Max 0.109859
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.001445 0.001432 1.009 0.314
res2[, 1] 0.004239 0.008345 0.508 0.612
res2[, 2] 0.061173 0.562911 0.109 0.914
Residual standard error: 0.02625 on 333 degrees of freedom
Multiple R-squared: 0.0008227, Adjusted R-squared: -0.005178
F-statistic: 0.1371 on 2 and 333 DF, p-value: 0.8719
Response AMZN :
Call:
lm(formula = AMZN ~ res2[, 1] + res2[, 2])
Residuals:
AMZN
Min -0.0774516
1Q -0.0079010
Median 0.0003749
3Q 0.0097951
Max 0.0942587
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0009015 0.0011369 0.793 0.428
res2[, 1] -0.0106498 0.0066239 -1.608 0.109
res2[, 2] 0.0892012 0.4468363 0.200 0.842
Residual standard error: 0.02084 on 333 degrees of freedom
Multiple R-squared: 0.007763, Adjusted R-squared: 0.001803
F-statistic: 1.303 on 2 and 333 DF, p-value: 0.2732
Response AAPL :
Call:
lm(formula = AAPL ~ res2[, 1] + res2[, 2])
Residuals:
AAPL
Min -6.653e-02
1Q -6.718e-03
Median 7.764e-05
3Q 8.841e-03
Max 7.123e-02
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.0004101 0.0009081 -0.452 0.652
res2[, 1] -0.0060079 0.0052906 -1.136 0.257
res2[, 2] 0.0199253 0.3568931 0.056 0.956
Residual standard error: 0.01665 on 333 degrees of freedom
Multiple R-squared: 0.003858, Adjusted R-squared: -0.002125
F-statistic: 0.6449 on 2 and 333 DF, p-value: 0.5254
Response META :
Call:
lm(formula = META ~ res2[, 1] + res2[, 2])
Residuals:
META
Min -0.0724668
1Q -0.0086499
Median 0.0005733
3Q 0.0101195
Max 0.1550199
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0001294 0.0011008 0.118 0.906
res2[, 1] -0.0094660 0.0064131 -1.476 0.141
res2[, 2] 0.2669782 0.4326112 0.617 0.538
Residual standard error: 0.02018 on 333 degrees of freedom
Multiple R-squared: 0.007445, Adjusted R-squared: 0.001484
F-statistic: 1.249 on 2 and 333 DF, p-value: 0.2882
Response GOOG :
Call:
lm(formula = GOOG ~ res2[, 1] + res2[, 2])
Residuals:
GOOG
Min -0.0503783
1Q -0.0062031
Median 0.0001847
3Q 0.0083292
Max 0.0648652
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0001927 0.0008357 0.231 0.818
res2[, 1] -0.0038732 0.0048689 -0.796 0.427
res2[, 2] 0.1370025 0.3284452 0.417 0.677
Residual standard error: 0.01532 on 333 degrees of freedom
Multiple R-squared: 0.002349, Adjusted R-squared: -0.003643
F-statistic: 0.3921 on 2 and 333 DF, p-value: 0.676
Response NFLX :
Call:
lm(formula = NFLX ~ res2[, 1] + res2[, 2])
Residuals:
NFLX
Min -0.0816279
1Q -0.0142296
Median -0.0005355
3Q 0.0140537
Max 0.0988555
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.001399 0.001382 1.012 0.312
res2[, 1] -0.011972 0.008049 -1.487 0.138
res2[, 2] -0.185907 0.542992 -0.342 0.732
Residual standard error: 0.02532 on 333 degrees of freedom
Multiple R-squared: 0.007063, Adjusted R-squared: 0.001099
F-statistic: 1.184 on 2 and 333 DF, p-value: 0.3073
Response MSFT :
Call:
lm(formula = MSFT ~ res2[, 1] + res2[, 2])
Residuals:
MSFT
Min -0.0516432
1Q -0.0071726
Median 0.0001211
3Q 0.0082643
Max 0.0681653
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.0004219 0.0008308 0.508 0.612
res2[, 1] -0.0039080 0.0048401 -0.807 0.420
res2[, 2] 0.1250301 0.3265030 0.383 0.702
Residual standard error: 0.01523 on 333 degrees of freedom
Multiple R-squared: 0.002329, Adjusted R-squared: -0.003663
F-statistic: 0.3887 on 2 and 333 DF, p-value: 0.6782
Response BTC :
Call:
lm(formula = BTC ~ res2[, 1] + res2[, 2])
Residuals:
BTC
Min -0.212629
1Q -0.021651
Median 0.001555
3Q 0.020849
Max 0.249921
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.002815 0.002958 0.952 0.342
res2[, 1] -0.015433 0.017233 -0.896 0.371
res2[, 2] -0.293128 1.162527 -0.252 0.801
Residual standard error: 0.05422 on 333 degrees of freedom
Multiple R-squared: 0.002644, Adjusted R-squared: -0.003346
F-statistic: 0.4414 on 2 and 333 DF, p-value: 0.6435
Response DOGE :
Call:
lm(formula = DOGE ~ res2[, 1] + res2[, 2])
Residuals:
DOGE
Min -0.483544
1Q -0.036620
Median -0.009632
3Q 0.019955
Max 0.698793
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.008086 0.005278 1.532 0.126
res2[, 1] 0.033136 0.030749 1.078 0.282
res2[, 2] -1.068394 2.074293 -0.515 0.607
Residual standard error: 0.09674 on 333 degrees of freedom
Multiple R-squared: 0.004153, Adjusted R-squared: -0.001828
F-statistic: 0.6943 on 2 and 333 DF, p-value: 0.5001
Response ETH :
Call:
lm(formula = ETH ~ res2[, 1] + res2[, 2])
Residuals:
ETH
Min -0.24781
1Q -0.04231
Median -0.01380
3Q 0.03035
Max 0.65185
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.014622 0.004935 2.963 0.00327 **
res2[, 1] -0.010296 0.028752 -0.358 0.72051
res2[, 2] -1.860643 1.939572 -0.959 0.33810
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.09046 on 333 degrees of freedom
Multiple R-squared: 0.003216, Adjusted R-squared: -0.002771
F-statistic: 0.5372 on 2 and 333 DF, p-value: 0.5849
Response LTC :
Call:
lm(formula = LTC ~ res2[, 1] + res2[, 2])
Residuals:
LTC
Min -0.215423
1Q -0.026111
Median -0.005071
3Q 0.016859
Max 0.691689
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.003552 0.004110 0.864 0.388
res2[, 1] -0.038808 0.023946 -1.621 0.106
res2[, 2] 0.210707 1.615340 0.130 0.896
Residual standard error: 0.07534 on 333 degrees of freedom
Multiple R-squared: 0.007841, Adjusted R-squared: 0.001882
F-statistic: 1.316 on 2 and 333 DF, p-value: 0.2696
Response MONERO :
Call:
lm(formula = MONERO ~ res2[, 1] + res2[, 2])
Residuals:
MONERO
Min -0.25695
1Q -0.04845
Median -0.01231
3Q 0.03882
Max 0.43537
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.012519 0.004903 2.553 0.0111 *
res2[, 1] -0.065699 0.028565 -2.300 0.0221 *
res2[, 2] 2.049735 1.926935 1.064 0.2882
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.08987 on 333 degrees of freedom
Multiple R-squared: 0.01844, Adjusted R-squared: 0.01254
F-statistic: 3.127 on 2 and 333 DF, p-value: 0.04512
## extracting the R squared of each regression and the factor betas, then plotting them
rsq2 = rep(0,12)
for (i in 1:12){rsq2[i]= slmfit2[[i]][[8]]}
beta_Energy = lmfit2$coef[2,]
beta_E_Rate = lmfit2$coef[3,]
par(mfrow=c(1,3))
barplot(rsq2, horiz = TRUE, names.arg = names(beta_Energy), main = "R squared")
barplot(beta_Energy, horiz = TRUE, main = "beta Energy")
barplot(beta_E_Rate, horiz = TRUE, main = "beta Exchange Rate")
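The residual-based factor regression above relies on hard-coded row ranges (`5:340`) and on a positional index (`[[8]]`) for the R squared. The following is a minimal sketch of the same steps on simulated data, assuming only base R; the series names (`energy`, `fx`, `A`, `B`) and all values are hypothetical. It drops the leading `NA` residual rows with `complete.cases()` and extracts the R squared by name.

```r
set.seed(1)
n <- 200

# Hypothetical factor changes and asset returns (simulated, not project data)
factor_changes <- cbind(
  energy = as.numeric(arima.sim(list(ar = 0.5), n = n)),
  fx     = rnorm(n, sd = 0.01)
)
asset_returns <- cbind(A = rnorm(n, sd = 0.02), B = rnorm(n, sd = 0.05))

# Fit a vector autoregression to the factors; the order is chosen by AIC
ar_fit <- ar(factor_changes)

# The first `order` residual rows are NA, so keep only complete rows
keep   <- complete.cases(ar_fit$resid)
shocks <- ar_fit$resid[keep, , drop = FALSE]

# Regress every return series on the factor shocks in a single call
fit <- lm(asset_returns[keep, ] ~ shocks[, 1] + shocks[, 2])

# Extract the R squared by name from each per-response summary
r_squared <- sapply(summary(fit), function(s) s$r.squared)
print(r_squared)
```

Deriving the retained rows from the fitted residuals means the code keeps working if the selected AR order or the number of valid observations changes.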
To build a Monte Carlo simulation that forecasts stock and crypto prices, we require a current price, an expected return, a volatility, a simulation period and a number of simulations. Existing code for a Monte Carlo simulation on yearly data is adapted to the daily data and run 10,000 times, giving a simulated price for each financial asset one year on from the current price.
It is no surprise that the cryptocurrencies show the largest increases. This is due to their higher average return over the measured period and to the Monte Carlo simulation not accounting for non-normal data.
However, as the distribution of the daily returns has been shown to be non-normal, the results of this simulation can be considered inaccurate and not a sufficient prediction of future prices.
## statistics for the daily returns of the selected assets
stats <- stat.desc(returns)
# Set parameters (total.df is the dataframe that contains the complete rows dataset of prices)
current_price = total.df[1486,]
annual_return = stats[9,]*252
annual_volatility = stats[13,]*sqrt(252)
simulation_period = 1
num_simulations = 10000
# Define function to simulate stock prices
simulate_stock_prices = function(current_price, annual_return, annual_volatility, simulation_period, num_simulations) {
# Calculate daily return and daily volatility
daily_return = annual_return / 252
daily_volatility = annual_volatility / sqrt(252)
# Generate random daily returns using normal distribution
random_daily_returns = matrix(rnorm(num_simulations * 252, mean = daily_return, sd = daily_volatility), nrow = num_simulations, ncol = 252)
# Calculate simulated stock prices
simulated_prices = t(apply(1 + random_daily_returns, 1, cumprod)) * current_price
return(simulated_prices)
}
##create object to store the predicted prices in
expected_future_price <- matrix(ncol=length(current_price),nrow=1)
colnames(expected_future_price) <- colnames(total.df)
## Run the simulation
for (i in 1:length(current_price))
{
simulated_prices = simulate_stock_prices(as.numeric(current_price[,i]), as.numeric(annual_return[,i]), as.numeric(annual_volatility[,i]), simulation_period, num_simulations)
# Calculate expected future price
expected_future_price[,i] = mean(simulated_prices[, 252])
}
print(expected_future_price)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
[1,] 417.7315 253.117 189.5357 467.5733 169.1334 747.5808 389.4894 100109.5 3.618803 17278.82 440.6639 1200.701
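As a sanity check on a simulation of this kind, the mean simulated terminal price can be compared with its closed form: with independent daily returns of mean m, the expected price after 252 days is the current price multiplied by (1 + m)^252. The sketch below does this for a single hypothetical asset; the price, return and volatility are illustrative assumptions, not project data.

```r
set.seed(42)

# Hypothetical inputs for a single asset (illustrative values only)
current_price    <- 100
daily_return     <- 0.001  # mean daily return
daily_volatility <- 0.02   # standard deviation of daily returns
n_days           <- 252
n_sims           <- 10000

# One row per simulated path, one column per trading day
r <- matrix(rnorm(n_sims * n_days, mean = daily_return, sd = daily_volatility),
            nrow = n_sims, ncol = n_days)

# Terminal price of each path: compound the daily returns
terminal_price <- current_price * apply(1 + r, 1, prod)

# With independent daily returns the expected terminal price has a closed form
analytic_mean  <- current_price * (1 + daily_return)^n_days
simulated_mean <- mean(terminal_price)

# The spread of outcomes is often more informative than the mean alone
interval <- quantile(terminal_price, c(0.05, 0.5, 0.95))

print(c(simulated = simulated_mean, analytic = analytic_mean))
print(interval)
```

Because the mean of the simulation is determined by the average return alone, reporting quantiles alongside it shows how wide the range of simulated outcomes is.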
An alternative distribution for the Monte Carlo simulation is the non-central Student’s t-distribution, which, unlike the normal distribution, can capture both heavy tails and skewness, both of which have been shown to be present in the distribution of returns.
The degrees of freedom are set to 20 and the non-centrality parameter to the mean standard deviation of all the assets’ returns. These parameters, though not perfect, may better represent the distribution of the returns.
These results, although not necessarily accurate, can be considered a prediction more reflective of the distribution of the returns.
The predicted prices show a large increase in the price of the cryptocurrencies, which is most likely a reflection of their higher average return over the measured period. The predicted prices for the stocks also show a large increase, as reflected in their positive mean return, but not to the same degree as the cryptocurrencies.
## statistics for the daily returns of the selected assets
stats <- stat.desc(returns)
# Set parameters (total.df is the dataframe that contains the complete rows dataset of prices)
current_price = total.df[1486,]
annual_return = stats[9,]*252
annual_volatility = stats[13,]*sqrt(252)
simulation_period = 1
num_simulations = 10000
# Define function to simulate stock prices
simulate_stock_prices = function(current_price, annual_return, annual_volatility, simulation_period, num_simulations) {
# Calculate daily return and daily volatility
daily_return = annual_return / 252
daily_volatility = annual_volatility / sqrt(252)
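# Generate random daily returns from a non-central t distribution, scaled by daily volatility
# (daily_return is not used in this version; the mean of the draws comes from ncp)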
random_daily_returns = matrix(rt(num_simulations * 252, df=20, ncp=0.04710566)*daily_volatility, nrow = num_simulations, ncol = 252)
# Calculate simulated stock prices
simulated_prices = t(apply(1 + random_daily_returns, 1, cumprod)) * current_price
return(simulated_prices)
}
##create object to store the predicted prices in
expected_future_price <- matrix(ncol=length(current_price),nrow=1)
colnames(expected_future_price) <- colnames(total.df)
## Run the simulation
for (i in 1:length(current_price))
{
simulated_prices = simulate_stock_prices(as.numeric(current_price[,i]), as.numeric(annual_return[,i]), as.numeric(annual_volatility[,i]), simulation_period, num_simulations)
# Calculate expected future price
expected_future_price[,i] = mean(simulated_prices[, 252])
}
print(expected_future_price)
TSLA AMZN AAPL META GOOG NFLX MSFT BTC DOGE ETH LTC MONERO
[1,] 356.0674 221.9699 176.9274 459.417 157.4803 732.861 344.8616 60778.45 1.30289 5552.273 317.7273 586.2319
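In the t-based simulation above, the draws are scaled by the daily volatility only, so their mean is set by the non-centrality parameter rather than by each asset’s own mean return, and their variance is somewhat larger than the target because a central t variate with df degrees of freedom has variance df / (df - 2). The sketch below is one possible alternative, not the method used in this project: a central t rescaled to a chosen mean and standard deviation. It keeps heavy tails but gives up skewness. The helper names `r_scaled_t` and `nct_mean` and the input values are hypothetical.

```r
set.seed(7)

# Hypothetical helper (not part of the original analysis): draw heavy-tailed
# returns with a chosen mean and standard deviation from a central t
r_scaled_t <- function(n, mean, sd, df) {
  stopifnot(df > 2)                          # variance is finite only for df > 2
  z <- rt(n, df = df) * sqrt((df - 2) / df)  # rescale to unit variance
  mean + sd * z
}

# Theoretical mean of a non-central t variate (valid for df > 1)
nct_mean <- function(df, ncp) {
  ncp * sqrt(df / 2) * exp(lgamma((df - 1) / 2) - lgamma(df / 2))
}

# Illustrative daily mean and volatility for a single asset
draws <- r_scaled_t(1e6, mean = 0.001, sd = 0.02, df = 20)
print(c(mean = mean(draws), sd = sd(draws)))

# Mean of rt(df = 20, ncp = 0.04710566) before scaling by daily volatility
implied_mean <- nct_mean(df = 20, ncp = 0.04710566)
print(implied_mean)
```

Multiplying `implied_mean` by an asset’s daily volatility gives the daily drift that the non-central version effectively assumes for that asset, which can be compared with its observed mean return.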
The results of this project suggest that cryptocurrencies behave in a significantly different way to the traditional financial assets examined. They are found to have had higher overall returns and volatility than the traditional assets. Interestingly, the correlation between the selected traditional assets and the selected cryptocurrencies is quite comparable, indicating some homogeneity between the two groups.
Although the factor models were insufficient to explain the variance of any of the selected assets, they did lend some credence to further investigation and pointed to potentially important relationships between the assets themselves and some economic factors.
References
Blockworks. (2023, March 16). The Investor’s Guide to Crypto Correlation. Blockworks. https://blockworks.co/news/the-investors-guide-to-crypto-correlation
Davidson, P., & Jones, C. (2023, May 10). CPI report live updates: Inflation dips to 4.9%; core consumer price gains stay elevated. USA TODAY. https://eu.usatoday.com/story/money/2023/05/10/cpi-report-data-inflation-live-updates/70200946007/#:~:text=The%20core%20consumer%20price%20index
Godwin, P. U. (2023, May 10). How Crypto Reacts to Changes on the Consumer Price Index. Tekedia. https://www.tekedia.com/how-crypto-reacts-to-changes-on-the-consumer-price-index/
Hern, A. (2014, January 20). It’s bobsleigh time: Jamaican team raises $25,000 in Dogecoin. The Guardian. https://www.theguardian.com/technology/2014/jan/20/jamaican-bobsled-team-raises-dogecoin-winter-olympics#:~:text=A%20group%20of%20supporters%20has
Jacoby, J. (2020, February 18). Amazon Empire: The Rise and Reign of Jeff Bezos. FRONTLINE. https://www.pbs.org/wgbh/frontline/documentary/amazon-empire/
Pandy, S. (2023, April 19). Netflix’s Market Share Decline Continues In 2023: Analysis Of Leading Streaming Platforms. Similarweb. https://www.similarweb.com/blog/insights/media-entertainment-news/streaming-q1-2023/
Reuters. (2022, May 27). Musk sued by Twitter investors for stock “manipulation” during takeover bid. Reuters. https://www.reuters.com/markets/deals/musk-sued-by-twitter-investors-delayed-disclosure-stake-2022-05-26/
Sephton, C. (2022, August 10). Monero vs Bitcoin | What is The Difference? Currency.com. https://currency.com/monero-vs-bitcoin-the-pros-and-cons
Zuboff, S. (2019, September 5). How Google Discovered the Value of Surveillance. Longreads. https://longreads.com/2019/09/05/how-google-discovered-the-value-of-surveillance/